Model Independent Search For New Physics At The

Tevatron 

by 

Georgios Choudalakis 

Submitted to the Department of Physics 
on April 24, 2008, in partial fulfillment of the 
requirements for the degree of 
Doctor of Philosophy in Physics 

Abstract 

The Standard Model of elementary particles cannot be the final theory. There are
theoretical reasons to expect the appearance of new physics, possibly at the energy
scale of a few TeV. Several possible theories of new physics have been proposed, each
with unknown probability to be confirmed. Instead of arbitrarily choosing to examine 
one of those theories, this thesis is about searching for any sign of new physics in a 
model-independent way. This search is performed at the Collider Detector at Fermilab 
(CDF). 

The Standard Model prediction is implemented in all final states simultaneously, 
and an array of statistical probes is employed to search for significant discrepancies 
between data and prediction. The probes are sensitive to overall population discrep- 
ancies, shape disagreements in distributions of kinematic quantities of final particles, 
excesses of events of large total transverse momentum, and local excesses of data 
expected from resonances due to new massive particles. 

The result of this search, first in 1 fb$^{-1}$ and then in 2 fb$^{-1}$, is null, namely no
considerable evidence of new physics was found.

Thesis Supervisor: Peter Fisher 
Title: Professor of Physics 
Chapter 1
Introduction 



1.1 The Standard Model 

Our current understanding of nature on its most fundamental level is encoded in the 
"Standard Model" of elementary particles. 

The building blocks of matter are categorized into three families of fermions and 
four gauge bosons, shown in Fig. 1-1. 

The Standard Model is a locally gauge-invariant quantum field theory, which
describes the electromagnetic, weak and strong interactions. Interactions are introduced
for free with the assumption that nature is symmetric under local gauge transformations
of the $U(1)_Y \times SU(2)_L \times SU(3)_C$ group [1]. Electromagnetic and weak
interactions are aspects of a unified electroweak interaction, and become distinguishable
as a result of electroweak symmetry breaking via the Higgs mechanism. Elementary particles
acquire bare mass by coupling to the same Higgs field that is responsible for the
electroweak symmetry breaking. The success of this model of electroweak interactions in
describing experimental data from the last 35 years builds confidence in the existence
of the Higgs boson, though it has not been directly observed as of today.

The Standard Model carries 26 free parameters, which are determined experimentally.
Depending on how one counts, they are the 6 lepton masses, the 6 quark
masses, 4 parameters from the CKM plus 4 from the PMNS matrix, the strong coupling
$\alpha_s$, the QCD angle $\theta_{QCD}$, the electromagnetic coupling $\alpha$, the
Weinberg angle $\theta_W$, the vacuum expectation value ($v$) and the mass ($m_H$) of
the Higgs.

Figure 1-1: Elementary particles in the Standard Model.

The success of the Standard Model is certainly among the greatest achievements 
in physics. At the same time, it is bound to not be the final theory. Some reasons 
are explained in Section 1.1.1. 



1.1.1 Limitations 

The most obvious shortcoming of the Standard Model, as it stands, is that it does not
describe gravity [2, 3]. Its domain is limited to energies much smaller than the Planck
mass ($M_{Pl}$), where from dimensional analysis gravity is expected to be comparable
to the other three known interactions.

Another nuisance is the presence of 26 free parameters. Past successful theories
have established in our minds some notion of scientific aesthetics, according to which
the fundamental theory should be able to derive, from first principles, numbers such
as the mass of the electron, or the amount of CP violation observed in systems like
$K$ and $B$ mesons. Otherwise one cannot claim to understand those effects. Grand
Unification Theories try to address these issues by embedding the Standard Model
into larger symmetry groups (Sec. 1.2.1).

There is overwhelming evidence (from observations of the cosmic microwave background
radiation, galaxy rotations, gravitational lensing, spectroscopy of clusters and
supernovae) that dark matter and dark energy dominate the mass-energy density of
the universe [4]. Currently, the Standard Model fails to provide a good candidate for
either.

Another puzzle is the so-called "hierarchy problem", namely why the electroweak
symmetry is broken at energy < 1 TeV, so much smaller than $M_{Pl}$, where gravity
becomes significant. Theories involving extra dimensions propose some answers
(Sec. 1.2.3).

Related to hierarchy is the problem of "naturalness" in the Standard Model.
A small parameter in a theory is "natural" when setting it to zero increases some 
symmetry of the theory, therefore its smallness can be attributed to that very sym- 
metry. For instance, the masslessness of a vector field such as the photon can be 
related to the gauge invariance of the theory. However, for a scalar field, such as 
the Standard Model Higgs, no symmetry is there to protect its mass from acquiring 
quadratically divergent corrections at the loop level (Fig. 1-3), unless the theory is 
highly fine-tuned (Fig. 1-2). The required precision of fine-tuning depends on how far 
one wishes to extend the validity of the Standard Model. If one wishes it to account for
loop corrections up to the Planck scale, while keeping the Higgs lighter than 1 TeV, 
as required by electroweak measurements, then the required fine-tuning is so precise 
that it seems unnatural (hence the connection between naturalness and hierarchy). 
A solution to this can be either to abandon the concept of fundamental scalars, as in 
technicolor models (Sec. 1.2.4), or to search for a theory where quadratic divergences 
cancel, as in Supersymmetry (Sec. 1.2.2). 
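
To get a feeling for the size of the fine-tuning discussed above, the following is a minimal sketch that uses the standard one-loop estimate of the top-quark contribution to the Higgs mass squared, $|\delta m_H^2| \approx \frac{3 y_t^2}{8\pi^2}\Lambda^2$, where $\Lambda$ is the cutoff up to which the Standard Model is assumed to hold. The formula, the values $y_t \approx 1$, $m_H \approx 100$ GeV and $M_{Pl} \approx 1.22 \times 10^{19}$ GeV, and the function names are assumptions made for illustration; none of them is taken from the text.

```python
import math

PLANCK_MASS_GEV = 1.22e19  # assumed value of the Planck mass


def top_loop_correction(cutoff_gev, top_yukawa=1.0):
    """|delta m_H^2| in GeV^2 from a top loop cut off at cutoff_gev (assumed estimate)."""
    return 3.0 * top_yukawa ** 2 / (8.0 * math.pi ** 2) * cutoff_gev ** 2


def tuning_fraction(cutoff_gev, higgs_mass_gev=100.0):
    """Ratio m_H^2 / |delta m_H^2|; values far below 1 signal a delicate cancellation."""
    return higgs_mass_gev ** 2 / top_loop_correction(cutoff_gev)


if __name__ == "__main__":
    for label, cutoff in [("1 TeV", 1.0e3), ("Planck scale", PLANCK_MASS_GEV)]:
        ratio = tuning_fraction(cutoff)
        print(f"cutoff = {label:12s}  m_H^2 / |delta m_H^2| ~ {ratio:.1e}")
```

Under these assumptions the ratio is of order one for a cutoff near 1 TeV, while for a cutoff at the Planck scale the bare mass term would have to cancel the correction to roughly one part in $10^{32}$ or better.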

1.2 Beyond the Standard Model 

Let me summarize the main proposals which address the limitations explained in 
Sec. 1.1.1, and what observable implications each suggests. 

1.2.1 Grand Unification 

The motivation behind Grand Unification Theories (GUTs) [6, 7] lies in questions such
as "why do protons and electrons have exactly opposite charge", or "why are there three
generations of fermions and three interactions". These questions could become less
thorny if instead of many we had just one symmetry group, which would make all
particles look like components of just one particle, and all interactions like aspects
of one force. Such a theory wouldn't only satisfy common taste, but more importantly
could derive from mathematical principles the values of some constants, such
as $\sin\theta_W$, which would be a significant advancement in our understanding of
nature from a reductionist's point of view.

Figure 1-3: Quantum corrections to the Higgs $m_H$, through fermion loops (a) and
Higgs's self-coupling (b).

Figure 1-4: Diagram leading to proton decay in the context of $SU(5)$ Grand Unification.

Several Lie algebras have been studied; notably $SU(5)$, $SO(10)$, $E_6$ and more
[2, 3]. Phenomenology varies significantly depending on the assumed symmetry.
A typically predicted effect is proton decay, as new gauge bosons, such as the one
in Fig. 1-4, are predicted in breaking these hyper-symmetries at some large energy,
typically $M_{GUT} \sim 10^{16}$ GeV.

1.2.2 Supersymmetry 

Supersymmetric theories take the approach of solving the problem of naturalness 
(Sec. 1.1.1), by having a bosonic loop for each fermionic one, thus canceling out the 
quadratically divergent loop corrections. 

SUSY introduces boson partners to Standard Model fermions, and fermion partners
to gauge bosons. It introduces operators which transform fields into "superpartners"
which differ from the original particles by half a unit of spin [8]. The
superpartners of gauge bosons are called "gauginos", those of leptons "sleptons" and
those of quarks "squarks" (Table 1.1).

Table 1.1: Ordinary particles and their superpartners.

SUSY can have additional favorable features, which increase interest in it. With 
the extra assumption of a conserved multiplicative quantum number (R-parity), which 
is +1 for ordinary particles and -1 for superpartners, the lightest superpartner be- 
comes stable, serving as a cold dark matter candidate [9]. Furthermore, a theory of 
local supersymmetry should lead to invariance under general coordinate transforma- 
tions, which may be the road to incorporating General Relativity into the Standard 
Model. Finally, SUSY can affect the running of couplings to make them exactly equal 
at some energy, in compliance with Grand Unification Theories. 
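
The consequence of a multiplicative R-parity can be illustrated with a small sketch: the R-parity of a multi-particle state is the product of the individual values, and a process is allowed only if that product is the same before and after. The particle names in the table (for instance "neutralino" as the lightest superpartner) and the helper functions are hypothetical choices for illustration, not a statement about any specific SUSY model.

```python
from math import prod

# R-parity as described in the text: +1 for ordinary particles, -1 for superpartners.
# The particle list itself is an illustrative assumption.
R_PARITY = {
    "electron": +1, "photon": +1, "quark": +1,
    "selectron": -1, "squark": -1, "neutralino": -1,
}


def r_parity(particles):
    """Multiplicative R-parity of a state made of the named particles."""
    return prod(R_PARITY[name] for name in particles)


def conserves_r_parity(initial, final):
    """True if the initial and final states carry the same R-parity."""
    return r_parity(initial) == r_parity(final)


if __name__ == "__main__":
    # Superpartners are produced in pairs from ordinary particles.
    print(conserves_r_parity(["quark", "quark"], ["squark", "squark"]))
    # A heavier superpartner may decay to a lighter one plus ordinary particles.
    print(conserves_r_parity(["selectron"], ["electron", "neutralino"]))
    # The lightest superpartner cannot decay to ordinary particles alone.
    print(conserves_r_parity(["neutralino"], ["electron", "photon"]))
```

The last check is the reason the lightest superpartner is stable: any final state made only of ordinary particles has R-parity +1, which differs from its own value of -1.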

If supersymmetry were exact, then each Standard Model particle would have a 
superpartner of equal mass. Since this is not observed, SUSY has to be broken at 
some energy scale [3]. It is non-trivial to construct models where SUSY is broken 
in ways that avoid contradicting observation, and simultaneously do not destroy its 
desirable features. 

The Higgs mass is predicted to be of order $10^2$ GeV/$c^2$, so for SUSY to secure it
from divergences it has to be introduced at energy < 1 TeV. That happens to be also the
energy scale where it needs to be introduced in order to equalize couplings at the
scale of $10^{16}$ GeV, associated with Grand Unification. These elements hint that, if
SUSY is a correct theory, it may be within reach for current experiments.

Most SUSY signatures involve large missing energy accompanied by multiple leptons
and jets. Missing energy would be the effect of stable and elusive superpartners,
while jets and leptons would result from long decay chains of unstable ones. 



1.2.3 Extra Dimensions 

Theories of extra dimensions are motivated by the hierarchy problem. 

One hypothesis is that of large extra dimensions, where the known 4 dimensions,
i.e. our "brane", are embedded in a manifold of higher dimensionality, and gravity
only appears to be feeble because part of it is projected onto our brane, while the
rest propagates in the extra dimensions, often referred to as "the bulk". By adjusting
the number of extra dimensions and their radius of curvature, one can make gravity
appear significant at $M_{Pl}$ and still lower its natural scale down to the electroweak
scale [10].

Theories with universal extra dimensions exist too, where fermions and/or gauge 
bosons also propagate in the bulk [11]. 

Other theories assume warped extra dimensions. Hierarchy then emerges by
exploiting the metric of the bulk space itself. For example, with one warped extra
dimension periodically bounded by two 3-dimensional branes, Einstein's equations
result in an anti-de Sitter metric, whose exponential factor makes gravity appear
feeble on one of the 3-branes, where the Standard Model fields are supposed to be
confined [12].

If at small distances gravity is not as feeble as suggested macroscopically by $M_{Pl}$,
then collider experiments could reveal the coupling of gravitons. For example, a
signature could be $p\bar{p} \to g G_n$, i.e. mono-jet events with large missing energy
due to the graviton $G_n$ escaping in the bulk (Fig. 1-5). Another signature of the
graviton could be the Standard-Model-forbidden $gg \to G_n \to \ell^+\ell^-$ [3]. In the
case of universal extra dimensions one may observe the Kaluza-Klein higher states of
fermions and bosons, through $Z' \to t\bar{t}$ for instance.
Figure 1-5: Possible signatures of graviton.

1.2.4 Technicolor

An alternative approach to electroweak symmetry breaking, which avoids the introduction
of fundamental scalar fields, is new strong dynamics. With the introduction
of a new non-abelian gauge symmetry and additional fermions ("technifermions")
which have this new interaction, it becomes possible to form a technifermion
condensate that can break the chiral symmetry of fermions, in a way analogous to QCD,
where the $q\bar{q}$ condensate breaks the approximate $SU(2) \times SU(2)$ symmetry down
to $SU(2)_{isospin}$. The breaking of global chiral symmetries implies the existence of
Goldstone bosons, the "technipions" ($\pi_T$), in analogy with QCD pions. Three of the
Goldstone bosons are absorbed through the Higgs mechanism to become the longitudinal
components of the $W$ and $Z$, which then acquire mass proportional to the
technipion decay constant.

Experimental signatures of technicolor are model dependent. For example, they
can be the resonance of a Standard Model gauge boson into an excited technivector
meson, like a technirho ($\rho_T$), which subsequently decays into $W$ and $\pi_T$, with
$\pi_T$ possibly decaying to regular quarks [3]. For example, assuming that $\pi_T$ couples
preferentially to the third generation, such a process could be
$\rho_T^\pm \to W^\pm \pi_T^0 \to \ell^\pm \nu b\bar{b}$, or

Figure 1-6: (a) Contact interaction allowed in the case of compositeness. (b) Tree-level
SM diagram with the same initial and final state, where the interaction is mediated
by a gauge boson.


1.2.5 Compositeness 

Compositeness is the idea that the Higgs and possibly other bosons and fermions
contain substructure. Compositeness addresses the problem of naturalness similarly
to technicolor, namely by avoiding the assumption of a fundamental scalar particle.

If quarks and leptons are not elementary, then they are predicted to have excited
states ($q^*$, $\ell^*$). For example, excited leptons could appear via
$\ell^* \to \ell\gamma$ or $\ell^* \to W\nu$.

More importantly, if quarks and leptons have structure, new interactions should
appear between them at the energy scale of their binding energy. They would be
contact interactions, allowing processes such as $q\bar{q} \to \ell^+\ell^-$ and
$q\bar{q} \to q\bar{q}$ to occur in ways additional to those of the SM (Fig. 1-6) [13, 3].
1.3 Current standpoint - Motivation

In 1995, the discovery of the top quark was announced [14], leaving the Higgs as the only
unobserved Standard Model particle. We now enter the Large Hadron Collider (LHC) 
era with some confidence that the Higgs will be observed to complete the Standard 
Model pantheon of particles. At the same time, there is hope that even what has to lie 
beyond the Standard Model will be revealed soon. If such a groundbreaking discovery 
is made, it will be different from the top quark or even a possible Higgs discovery, in 
the sense that it will signify the opening to a new continent of unexplored physics. 

Nature has proven its capacity to surprise us. There are many ideas of what the 
new physics may be, but there is no need for any of them to be right. So, especially in 
this historical time when we expect to overcome the current impasse, it makes sense 
to search for any sign of discrepancy between the data and the Standard Model, 
without introducing any bias in what it may look like. This is the motivation behind 
performing a model-independent and global search. 

The Tevatron stands at the current high energy frontier, producing $p\bar{p}$ collisions
at energy 1.96 TeV and constantly increasing luminosity. Although the size and reach
of the Tevatron are inferior to those of the LHC, there is still a window of opportunity
in the former, until the latter has collected data and understood systematic effects 
specific to it. It would be undesirable to discover something at the LHC and then 
look back only to realize that it had been overlooked at the Tevatron. On the other 
hand, performing a global, model-independent analysis of the Tevatron data has the 
potential of revealing evidence of new physics that can be cross-checked at the LHC. 
This hope motivates the present work. 
Chapter 2

Experimental apparatus 



The present search for new physics is performed in data collected with the Collider
Detector at Fermilab (CDF), a general-purpose detector for particles generated in high
energy $p\bar{p}$ collisions produced by the Tevatron accelerator. The Tevatron and the
Fermi National Accelerator Laboratory (FNAL) are shown in Fig. 2-1.

This chapter describes the production of $p\bar{p}$ collisions and the CDF detector. For
the many acronyms used, please consult Appendix D. 

2.1 Beam Production 

Either due to CP violation or some other unknown reason, free protons outnumber 
antiprotons, which makes it easier to obtain the former, and use them to generate the 
latter. In this section, the procedure leading to the production of the $p$ and $\bar{p}$ beams
is outlined. 

2.1.1 p Source 

The production starts with storing hydrogen gas ($H_2$) in a Cockcroft-Walton chamber
[15], in which a 750 kV DC voltage causes electric discharges which produce negative
hydrogen ions ($H^-$). The $H^-$ are separated from the rest of the gas by use of
a magnetic transport system and are channeled to the Linac.

Figure 2-1: Sketch of the FNAL accelerator complex.



The Linac [16] is a 130 m long Alvarez linear accelerator that transfers the $H^-$
from the Cockcroft-Walton to the Booster, accelerating them from 750 keV to 400
MeV.

The Booster [17] is a 475 m long synchrotron that accelerates the $H^-$ from 400
MeV to 8 GeV in just 67 ms, hence its name. One Linac load is 40 μs long and
the rotation period of the beam in the Booster during injection is 2.22 μs, which
means that in principle it could take $\frac{18 \times 2.22}{40} = 99.9\%$ of the Linac's
load in 18 turns. Operationally however, only 5 or 6 turns get used for maximum
intensity, and the rest (66.7%) of the Linac's load is dumped. At the entrance, the
$H^-$ ions pass through a carbon foil, which strips off the electrons, transforming
$H^-$ into $H^+$, viz. protons. It is important that the $H^-$ pass through the carbon
foil at their entrance to the ring, as they meet with the circulating $H^+$. This
technique, named CEI, allows for higher beam brightness, avoiding limitations that would
have otherwise followed from Liouville's theorem [18]. A full Booster "batch" contains
a maximum of $5 \times 10^{12}$ protons at 8 GeV, coalesced into 84 bunches, ready to be
delivered to the Main Injector.
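
The injection arithmetic quoted above can be reproduced with a few lines. This is only a sketch of the bookkeeping, using the 40 μs Linac load and the 2.22 μs Booster rotation period from the text; the function name is a hypothetical choice.

```python
LINAC_LOAD_US = 40.0       # length of one Linac load, from the text
BOOSTER_PERIOD_US = 2.22   # beam rotation period in the Booster during injection


def captured_fraction(turns):
    """Fraction of one Linac load injected into the Booster in the given number of turns."""
    return min(1.0, turns * BOOSTER_PERIOD_US / LINAC_LOAD_US)


if __name__ == "__main__":
    for turns in (5, 6, 18):
        used = captured_fraction(turns)
        print(f"{turns:2d} turns: {100 * used:.1f}% used, {100 * (1 - used):.1f}% dumped")
```

With 6 turns about a third of the load is used, which is where the quoted 66.7% of dumped beam comes from.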
2.1.2 Main Injector

The Main Injector [19] is a 3.319 km long non-circular synchrotron, serving not only 
the Tevatron, but also providing protons for the production of the NuMI neutrino 
beam and the proton beam in the Fixed Target area. Its operations that relate to 
the Tevatron are: 

1. $\bar{p}$ production: A single Booster batch is injected into the MI at 8 GeV. These
protons are accelerated to 120 GeV and extracted in a single turn for delivery
to the $\bar{p}$ production target. The produced antiprotons will eventually return to
the MI for acceleration to 150 GeV, before they are delivered to the Tevatron. 

2. Collider mode: Accelerate protons or antiprotons to 150 GeV and deliver them 
to the Tevatron. 

3. End of store: Accept 150 GeV antiprotons and decelerate them to 8 GeV for 
storage in the Recycler. 

2.1.3 $\bar{p}$ Source

At the $\bar{p}$ production area, the 120 GeV protons coming from the MI are directed
onto a nickel target [20]. Before the collision, the bunch undergoes a modulation
called RF bunch rotation, so as to become shorter in time and, in agreement with
Liouville's theorem, contain a wider spectrum of momenta. The shorter proton pulse
maximizes the phase-space density of the antiprotons produced as secondary products of
the collision with the nickel target. First, the cone of particles produced at the
collision is rendered parallel by means of a lithium lens [21]. Then, a dipole magnet
selects 8 GeV antiprotons, as that is the standard MI injection energy, and directs
them into the Debuncher.

At the Debuncher [20], which is a "ring" of rounded triangular shape, the 8 GeV
antiprotons are subjected to an RF bunch rotation, this time in the reverse direction,
so that their beam contains a narrower spectrum of momenta and, in agreement
with Liouville's theorem, spans a longer time interval. This reduction in momentum
spread is done to improve the Debuncher-to-Accumulator transfer, because of the
limited momentum aperture of the Accumulator at injection. The Debuncher makes
use of the time between MI cycles to reduce the beam transverse size and longitudinal
momentum spread through betatron and momentum stochastic cooling respectively.
This further improves the efficiency of the Debuncher-to-Accumulator transfer.
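
The trade-off between time spread and momentum spread in a bunch rotation can be illustrated with a toy model. The sketch below idealizes the rotation as a rigid quarter turn in normalized longitudinal phase space (time, momentum); this idealization, the arbitrary units and the function names are assumptions, not a description of the actual RF manipulation. Because the map has unit determinant, the phase-space area (rms emittance) is conserved, as Liouville's theorem requires, while the two widths are exchanged.

```python
import math
import random


def rms_widths_and_emittance(points):
    """Return (rms time spread, rms momentum spread, rms emittance) of (t, p) points."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_p = sum(p for _, p in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points) / n
    var_p = sum((p - mean_p) ** 2 for _, p in points) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in points) / n
    return math.sqrt(var_t), math.sqrt(var_p), math.sqrt(var_t * var_p - cov ** 2)


def rotate(points, angle):
    """Rigidly rotate the bunch in (t, p) phase space; the map has unit determinant."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * t - s * p, s * t + c * p) for t, p in points]


if __name__ == "__main__":
    rng = random.Random(1)
    # A bunch that is short in time and wide in momentum (arbitrary normalized units).
    bunch = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 5.0)) for _ in range(20000)]
    for name, b in [("before", bunch), ("after", rotate(bunch, math.pi / 2))]:
        sigma_t, sigma_p, emittance = rms_widths_and_emittance(b)
        print(f"{name:6s}: sigma_t = {sigma_t:.2f}, sigma_p = {sigma_p:.2f}, "
              f"emittance = {emittance:.2f}")
```

After the quarter turn the bunch is long in time and narrow in momentum, with the same emittance, which is the situation the Debuncher aims for.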

The Accumulator [20] is a rounded triangular "ring", similar to the Debuncher.
The reason is that it also applies stochastic cooling to the $\bar{p}$ beam, which
requires linear segments along the ring to accommodate pickups and kickers. The
main purpose of the Accumulator is to hold antiprotons until they are needed by the
Tevatron. The antiprotons are stored in the Accumulator for hours or days, while their
number grows as more are produced at the nickel target. When a new pulse of antiprotons
enters the Accumulator, it circulates along a trajectory of greater "radius" than the 
antiprotons that have already been cooled down. The RF decelerates the recently 
injected pulses of antiprotons from the injection energy to the edge of the stack tail. 
The stack tail momentum cooling system sweeps the beam deposited by the RF away 
from the edge of the tail and decelerates it towards the dense portion of the stack, 
known as the core. Additional cooling systems keep the antiprotons in the core at 
the desired momentum and minimize the transverse beam size. 

There is yet another ring, the Recycler [22], which has a role similar to that
of the Accumulator. It is a 3.3 km long ring that runs along the MI, and is therefore
much longer than the Accumulator; if the Accumulator is getting full, it
can use the Recycler to hold some antiprotons too. Spread over a longer ring, the
antiprotons in the Recycler are easier to keep stable, since the beam is less dense
and the dispersive forces weaker. In addition to being longer, the Recycler employs
the electron cooling method to reduce the momentum spread of the antiprotons.
Electron cooling is a more modern technique than stochastic cooling, in which a cold
(small momentum spread) beam of electrons travels parallel to the hot antiproton
beam and serves as a heat sink. The two beams interact electromagnetically, and
heat flows from the hotter system to the cooler one. Once the electron beam heats
up, it is discarded for a new, cold electron beam to take over. The Recycler accepts
not only the antiprotons that the Accumulator cannot hold, but also those that
the Tevatron does not need any more. Since antiprotons are so hard to produce,
the Recycler keeps them to be reused in the next "store", hence its name. When
the stored antiprotons reach adequate quantity, the Tevatron is ready to start
$p\bar{p}$ collisions.

2.1.4 Tevatron 

For over two decades, the Tevatron [23, 3] has been the largest hadron collider, to be 
soon succeeded by the Large Hadron Collider (LHC) at CERN. It is a synchrotron 
accelerator with radius 1 km. Along its ring are 774 dipole and 216 quadrupole
superconducting magnets, providing a magnetic field of 4.4 T. The magnets
operate in the superconducting state, cooled by liquid helium.

The Tevatron receives $p$ and $\bar{p}$ bunches from the MI, where they have been
accelerated from 8 to 150 GeV. The filling takes about 30 minutes, much longer than
the acceleration period, which is only 86 seconds. It accelerates the $p$ and the
$\bar{p}$ beam to the energy of 980 GeV, producing head-on collisions at
$\sqrt{s} = 1.96$ TeV in the reference frame of CDF [3]. The proton and antiproton beams
are each divided into 3 trains of 12 bunches, so there are 36 $p$ and 36 $\bar{p}$ bunches
traveling in opposite directions at the same energy. Each bunch is about 18 ns (57
cm) long, which is the length of one RF bucket$^1$ at the Tevatron. The interval
between successive bunch crossings is 396 ns (21 buckets), which is of course equal to
the interval between successive bunches in a train. Successive trains are separated by
longer (2621 ns or 139 buckets) intervals, called abort gaps.
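
The numbers in the paragraph above are tied together by simple arithmetic, which the following sketch reproduces. All inputs are taken from the text; the constant and function names are hypothetical, and the collision energy is computed for head-on beams of equal energy with the particle masses neglected.

```python
BEAM_ENERGY_GEV = 980.0        # energy of each beam
TRAINS_PER_BEAM = 3
BUNCHES_PER_TRAIN = 12
BUNCH_SPACING_NS = 396.0       # interval between bunch crossings
BUNCH_SPACING_BUCKETS = 21     # the same interval counted in RF buckets
ABORT_GAP_BUCKETS = 139


def bucket_length_ns():
    """Duration of one RF bucket, inferred from the bunch spacing."""
    return BUNCH_SPACING_NS / BUNCH_SPACING_BUCKETS


def bunches_per_beam():
    return TRAINS_PER_BEAM * BUNCHES_PER_TRAIN


def abort_gap_ns():
    return ABORT_GAP_BUCKETS * bucket_length_ns()


def center_of_mass_energy_tev():
    """sqrt(s) for head-on collisions of equal-energy beams, masses neglected."""
    return 2.0 * BEAM_ENERGY_GEV / 1000.0


if __name__ == "__main__":
    print(f"RF bucket length : {bucket_length_ns():.2f} ns")
    print(f"bunches per beam : {bunches_per_beam()}")
    print(f"abort gap        : {abort_gap_ns():.0f} ns")
    print(f"sqrt(s)          : {center_of_mass_energy_tev():.2f} TeV")
```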

Each $p$ and $\bar{p}$ bunch counts about $24 \times 10^{10}$ and $6 \times 10^{10}$
particles respectively. As of today, the beam's optical properties allow for
instantaneous luminosity that is over $2 \times 10^{32}$ cm$^{-2}$s$^{-1}$ at CDF, and
about 15% lower at DØ [24, 25].

$^1$ An RF bucket is a slot defined by the RF electromagnetic waves, in which a bunch
may be accommodated.
Figure 2-2: Cut-away view of the CDF detector.

2.2 The CDF detector 

CDF is a ~5,000 ton detector [26] enveloping the B0 collision point of the Tevatron
(Fig. 2-1). Externally, it looks forward-backward symmetric (Fig. 2-2), mostly made
of steel, of dimensions that are approximately 16 m x 13 m x 13 m. It is underground, 
shielded behind tons of concrete, which keeps it somewhat insulated from environ- 
mental sources of noise and prevents potentially hazardous radiation from leaking 
into its immediate surroundings. A three story building houses in its basement the 
detector and its assembly site, while in the superjacent levels it accommodates the 
data acquisition devices and the Control Room, from where operations are managed. 

The CDF detector allows for a broad range of physics searches, from heavy flavor 
physics to searches of exotic new phenomena. It combines a variety of features, i.e. 
tracking, timing, calorimetry and muon detection systems, all seamed together with 
powerful trigger and DAQ systems. 

By 1996, when the Run I period of the Tevatron was over, about 90 pb$^{-1}$ of data had
been collected, in which the long-sought $t$-quark had eventually been discovered [14].
In preparation for the even more ambitious Run II era, which started in 2001, CDF
was decisively upgraded [26], with new tracking and calorimetry capabilities and a
much more efficient muon detection system. The DAQ system had to be upgraded
too, to respond to the expected instantaneous luminosity of up to
$5 \times 10^{32}$ cm$^{-2}$s$^{-1}$.
In the following sections, the current status of CDF will be described.
Figure 2-3: Transverse section of half of the CDF detector in Run II.

2.2.1 Coordinate Systems

Before describing the most important CDF components, it would be useful to present 
the established system of coordinates used at the experiment. 

The Cartesian coordinate system has its origin at the detector's center,
where the beams of $p$ and $\bar{p}$ are supposed to collide. The $y$ axis is defined to
point vertically up, and the $x$ axis to be perpendicular to the beam pipe and pointing
in the direction away from the center of the Tevatron ring. In terms of $\hat{x}$ and
$\hat{y}$, $\hat{z}$ is $\hat{x} \times \hat{y}$, which approximately coincides with the
direction in which the $p$ beam travels through the center of CDF.

The cylindrical coordinate system reflects the approximate axial symmetry of the
tracker and the calorimeter around $z$; the unit vector $\hat{z}$ remains the same as in
the Cartesian system. The radial unit vector $\hat{r}$ at each point is
perpendicular to and pointing away from the $z$ axis. The azimuthal angle $\phi$ is by
definition zero on the semi-infinite $z$-$x$ plane that contains the positive $x$ axis,
and increases in the direction of $\hat{\phi} = \hat{z} \times \hat{r}$.

Spherical coordinates are used more often than the above two systems. The reason 
is that, to the physical event occurring in a $p\bar{p}$ scattering, the cylindrical or any
other symmetry of the surrounding detector is irrelevant. The dynamics of the event
recognize one special axis, viz. $z$, along which the $p$ and $\bar{p}$ were traveling right before
their collision. It is therefore convenient to define the angles of all outcoming particles 
with respect to $z$. For any point in space, a radial unit vector $\hat{r}$ is defined to
point in the direction away from the origin of the coordinates. Also, a polar angle
$\theta$ is defined, which is zero along the positive $z$ axis and increases in the
direction of $\hat{\theta} = \frac{\hat{z} \times \hat{r}}{\sin\theta} \times \hat{r}$.
Finally, the azimuthal angle $\phi$ is defined as in the cylindrical coordinates
and increases along $\hat{\phi} = \hat{r} \times \hat{\theta}$.

Since the $p$ and $\bar{p}$ beams are unpolarized, $z$ has to be an axis of symmetry when
examining a large set of events. In other words, based on the premise of isotropy of
the universe, which leaves $z$ as the only axis special to the scattering, there can be no
law of physics that would cause a non-uniform distribution in $\phi$ of the particles
coming out of the scattering.

It is common not to mention the polar angle $\theta$ per se, but instead a dimensionless
quantity called "pseudorapidity", which is related to $\theta$ as

$\eta = -\ln\left(\tan\frac{\theta}{2}\right)$. (2.1)

$\eta$ is the $E \to |\vec{p}|$ limit of the quantity called "rapidity", which is$^2$

$y = \frac{1}{2}\ln\frac{E + p_z}{E - p_z}$

and has the beautiful property that for any pair of rapidities, the difference $\Delta y$ is
invariant under Lorentz boosts along the $z$ axis.
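
Both statements, that pseudorapidity is the massless limit of rapidity and that rapidity differences are invariant under boosts along $z$, can be checked numerically. The sketch below assumes natural units ($c = 1$) and the standard definition $y = \frac{1}{2}\ln\frac{E + p_z}{E - p_z}$; the function names and the example particles are hypothetical.

```python
import math


def pseudorapidity(theta):
    """eta = -ln(tan(theta/2)), Eq. (2.1); theta is the polar angle in radians."""
    return -math.log(math.tan(theta / 2.0))


def rapidity(energy, pz):
    """y = 0.5 * ln((E + pz) / (E - pz)), in natural units."""
    return 0.5 * math.log((energy + pz) / (energy - pz))


def four_momentum(mass, momentum, theta):
    """Return (E, pz) for a particle with the given mass, |p| and polar angle."""
    return math.sqrt(mass ** 2 + momentum ** 2), momentum * math.cos(theta)


def boost_along_z(energy, pz, beta):
    """Lorentz-boost (E, pz) along the z axis with velocity beta (in units of c)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * (energy - beta * pz), gamma * (pz - beta * energy)


if __name__ == "__main__":
    theta = 0.7
    # For a massless particle E = |p|, and rapidity coincides with pseudorapidity.
    print(rapidity(*four_momentum(0.0, 10.0, theta)), pseudorapidity(theta))

    # Two massive particles: the rapidity difference is unchanged by a boost along z.
    a = four_momentum(0.14, 5.0, 0.5)
    b = four_momentum(0.94, 8.0, 2.0)
    dy_lab = rapidity(*a) - rapidity(*b)
    dy_boosted = rapidity(*boost_along_z(*a, 0.6)) - rapidity(*boost_along_z(*b, 0.6))
    print(dy_lab, dy_boosted)
```

Each individual rapidity shifts by the same constant under the boost, which is why only the difference is invariant.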



$^2$ The rapidity $y$ should not be confused with the Cartesian coordinate $y$.
2.2.2 Tracking

Tracking is crucial for particle identification; it has been so since the first experiments 
with wire and bubble chambers. Though technology has advanced, the principles 
remain: 

• Only ionizing particles leave tracks, which distinguishes them from neutral ones. 

• The curvature of a track under the influence of the Lorentz force in the presence of
a magnetic field $\vec{B}$ is a measure of the transverse momentum $p_T$ of the particle,
namely of the projection of its momentum $\vec{p}$ on the plane transverse to $\vec{B}$.

• The direction of the track can be used to estimate the direction ($\eta$, $\phi$) in which
a particle is produced.

• Being able to observe tracks improves our intuitive understanding of what particles
are produced in an event. For example, the assembly of tracks within a
cone is indicative of hadronic jet showers, while isolated tracks are more likely
leptons$^3$.

• Extrapolating the tracks of an event down to their origin(s) indicates the position
of the event. This can reveal the existence of displaced secondary vertices,
indicative of the decay of a long-lived particle, such as a $B$ meson. It may also
indicate the existence of multiple $p\bar{p}$ interactions in the same bunch crossing,
by observation of multiple primary vertices in the same event.
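
The relation between track curvature and transverse momentum mentioned in the list above can be made concrete with the standard formula $p_T\,[\mathrm{GeV}/c] \approx 0.3\, |q|\, B\,[\mathrm{T}]\, R\,[\mathrm{m}]$ for a particle of charge $q$ (in units of $e$) moving on a circle of radius $R$ in a uniform field $B$. The sketch below is illustrative only: the field value of 1.4 T used in the example and the function names are assumptions, not taken from the text.

```python
# Conversion factor: speed of light in units that give GeV/c from tesla * meter.
GEV_PER_TESLA_METER = 0.299792458


def transverse_momentum_gev(radius_m, b_field_tesla, charge=1.0):
    """p_T in GeV/c of a track with curvature radius radius_m in a uniform field."""
    return GEV_PER_TESLA_METER * abs(charge) * b_field_tesla * radius_m


def curvature_radius_m(pt_gev, b_field_tesla, charge=1.0):
    """Radius of the circle described by a particle of given p_T in a uniform field."""
    return pt_gev / (GEV_PER_TESLA_METER * abs(charge) * b_field_tesla)


if __name__ == "__main__":
    assumed_field_tesla = 1.4  # illustrative value, an assumption of this sketch
    for pt in (0.5, 5.0, 50.0):
        radius = curvature_radius_m(pt, assumed_field_tesla)
        print(f"p_T = {pt:5.1f} GeV/c  ->  R = {radius:7.2f} m")
```

The radius grows linearly with $p_T$, so high-momentum tracks are nearly straight and their momentum is correspondingly harder to measure.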

Silicon Detector 

The first tracking device particles pass through is the Silicon Detector. Silicon allows 
for a highly granular and radiation tolerant tracker that can survive as near as 1.5 
cm from the collision point [26]. The operation principle of a silicon micro-strip is 
depicted in Fig. 2-4 [3, 27]. 

$^3$ Even though the $\tau$ is a lepton, it is common to include only electrons and muons
in the term "leptons", because they are easier to identify than the $\tau$, which often
decays hadronically, so they constitute "cleaner" leptons in the experimental sense.






About 722,000 read-out channels come from the Silicon Detector [28], far more
than from any other CDF component. It is divided into three subsystems: L00, SVX
and ISL (Fig. 2-5, 2-6).

L00 is a single layer of single-sided silicon built directly onto the beam pipe, at 1.5
cm radius. It provides precision position measurement before the particles undergo
multiple scattering.

SVX is the heart of the Silicon Detector, consisting of 12 identical wedges in $\phi$.
Each wedge contains 5 layers of double-sided silicon, oriented parallel to the beam 
pipe at radii from 2.5 to 10.6 cm. On one side, the silicon strips are aligned axially. 
The other side has 90° stereo strips for 3 of the layers, and 1.2° stereo strips for 
the remaining 2 layers. Obviously, the choice of aligning some strips non-axially was 
made to allow for three-dimensional track reconstruction. 

The ISL envelops SVX. It carries 1.2° stereo double-sided silicon in a single layer 
for intermediate radius measurement of central^ tracks and in two layers for tracking 
in the region 1 < |η| < 2, which is not completely covered by the COT (Fig. 2-6). 

The silicon embedded strips are 8 μm wide [29], which brings the hit's spatial 
resolution down to about 12 μm. This resolution makes it possible to measure the 
impact parameter of a track to 40 μm, with 30 μm uncertainty due to the beam 
width. The z_0, namely the z-coordinate of the primary vertex, can be measured with 
70 μm accuracy. 

Central Outer Tracker 

The COT [30, 31] is a cylindrical multi-wire open-cell drift chamber surrounding the 
Silicon Detector (Fig. 2-6). 

COT contains Argon-Ethane (Ar-C2H6) in a 1:1 mixture. When charged particles 
traverse the gaseous mixture they leave a trail of ionization electrons, which 
drift under the influence of a 1.9 kV/cm electric field. The latter is produced by 
field planes and homogenized by potential and shaper wires. After some time that 

^Here and below the word "central" is used to describe objects with |η_det| < 1.0; "plug" is used 
to describe objects with 1.0 < |η_det| < 2.5. 
Figure 2-4: Schematic of a silicon particle sensor. An array of finely spaced p-type 
silicon strips is implanted in an n-type silicon substrate, typically 300 μm thick. The 
n-p contact is then reverse biased, typically with a depletion voltage of 150 V. 
When an ionizing particle traverses the depletion zone it creates a localized stream 
of e⁻-hole pairs, which are collected by the nearest strips, where after amplification 
they are detected as small current signals. There are variations in the design of silicon 
strips, such as double-sided strips where signals are read from both sides. The spatial 
resolution of the most advanced silicon strip can be as fine as 2-4 μm, limited mostly 
by diffusion [3, 27]. 
[Figure labels: Layer 00, SVX II, Intermediate Silicon Layers] 

Figure 2-6: Schematic profile (RZ view) of the central part of the CDF detector [29]. 
The Time Of Flight detector (not shown) is between the COT and the solenoid. The 
central electromagnetic and hadronic calorimeter are not depicted either. 



[Figure 2-7 legend: + potential wires; • sense wires; × shaper wires; solid line: 
field panel; bare Mylar (electrostatic shielding). Horizontal axis: r (cm), from 52 to 66.] 

Figure 2-7: Three COT cells from the second superlayer (XY view). Their inclination 
with respect to the radial direction is equal to the Lorentz angle of 35° (see text). 

depends on the distance they travel, the ionization electrons are collected by sense 
wires immersed in the gas, producing a detectable^ electric signal. The r-φ location 
of the track with respect to the sense wire is then estimated from the time it takes to 
detect the signal. The drift distance is less than 0.88 cm and is covered in less than 
100 ns, which is less than the 396 ns between successive bunch crossings, and therefore 
causes no pile-up of signals from different events. 

The field panels, shaper, potential and sense wires are all grouped in electrostatically 
shielded cells (Fig. 2-7). Each cell contains 12 sense, 13 potential and 4 shaper 
wires. Sense and potential wires alternate with successive sense wires being 7 mm 
apart. Combining drift time information from several wires, the single hit resolution 
reduces to about 140 μm. 

Cells are arranged in 8 superlayers (Fig. 2-8). The wires in the 1st and 5th 
superlayer are not oriented axially, but at a stereo angle of +3°. Similarly, there is a 

^When an ionization electron approaches the 40 μm thick sense wire it is accelerated by its rapidly 
increasing (1/r) electric field, producing an "avalanche" of secondary ionization electrons and thus 
enhancing the signal. 
Figure 2-8: Part of the COT endplate (XY view). The wire-plane slots grouped into 
eight superlayers are shown. 

stereo angle of −3° in superlayers 3 and 7. As in the case of the Silicon Detector, the 
reason that 4 out of the 8 superlayers are oriented non-axially is to allow for tracking 
in the three dimensions^. 

It was mentioned that ionization electrons drift under the influence of an electric 
field E, but there is also a magnetic field B parallel to the z axis. So, as the force −eE 
accelerates the electron, the force −ev × B turns it on the x-y plane (Fig. 2-9). At 
any time the velocity of the electron in the medium can be parametrized as v = μE, 
where μ is the mobility of the medium. Assuming that the E field is homogeneous 
on the x-y plane and the electron is non-relativistic, the equilibrium is at an angle 
ψ with respect to E that is ψ_L = arctan(μB). ψ_L is called the Lorentz angle and 
for the COT it is about 35°. The wires in the COT cells are then arranged along the 
direction determined by the Lorentz angle, to minimize the drift time and maximize 
the COT efficiency and resolution (Fig. 2-7). 
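
As a rough numerical illustration of the relation ψ_L = arctan(μB), the following 
Python sketch converts between mobility and Lorentz angle. The function names are 
hypothetical, and the mobility it computes is only the value implied by the quoted 
35° angle and the 1.4 T solenoid field under this simplified model; it is not a 
measured property of the COT gas. 

```python
import math


def lorentz_angle_deg(mobility_m2_per_vs: float, b_field_tesla: float) -> float:
    """Lorentz angle psi_L = arctan(mu * B), returned in degrees.

    With mu in m^2/(V s) and B in tesla the product mu * B is dimensionless.
    """
    return math.degrees(math.atan(mobility_m2_per_vs * b_field_tesla))


def implied_mobility(angle_deg: float, b_field_tesla: float) -> float:
    """Invert psi_L = arctan(mu * B) to get the mobility implied by an angle."""
    return math.tan(math.radians(angle_deg)) / b_field_tesla


# Values quoted in the text: psi_L of about 35 degrees, B = 1.4 T.
mu = implied_mobility(35.0, 1.4)
angle = lorentz_angle_deg(mu, 1.4)  # round trip back to 35 degrees
```

The round trip is only a consistency check of the formula; any realistic treatment 
would account for the field dependence of the mobility. 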



^If all COT wires were parallel to the z axis, then the z coordinate of hits would be unknown. 
Figure 2-9: The trajectory of an ionization electron in the E and B field of the 
COT. The condition eE sin ψ_L = evB = eμE cos ψ_L B determines the Lorentz angle 
ψ_L = arctan(μB). 

Magnet 

A 1.4 T magnetic field is produced in the −z direction by the superconducting solenoid 
surrounding the COT (Fig. 2-6 and 2-3). 

The magnetic field is essential for the measurement of the transverse momentum 
(p_T) of ionizing particles. Greater magnetic field intensity and bigger tracking 
volume radius improve p_T resolution, which on the other hand is limited by the spatial 
resolution of the tracker and multiple scattering [3]. At CDF, the p_T resolution is 
Track reconstruction 

The Silicon Detector and the COT record a large number of hits in each event, 
viz. discrete positions through which ionizing particles seem to have passed. But the hits 
alone do not suffice. In each event there are tens of charged particles, as well as false 
hits. What is needed is an algorithm to reconstruct tracks out of the thousands of 
hits of each event. 

Every track is a helix that can be parametrized in terms of the variables in Table 
2.1. Essentially, tracking algorithms fit for those 5 parameters to best match the 
observed hits [32, 33]. 

Tracking in the COT using the Segment Linking algorithm involves first 
reconstructing linear segments of the track in each of the eight superlayers [33]. Then, the 
θ     the polar angle at minimum approach, which refers to the point of the 
      track closest to the z axis. 

C     semi-curvature of the track (inverse of diameter), with the same sign as 
      the particle's electric charge. 

z_0   z coordinate at minimum approach. 

D     signed impact parameter: distance between helix and the z axis at 
      minimum approach. The sign of D is given from its formal definition: 
      D = sign(q)(√(x_0² + y_0²) − ρ), where q is the ionizing particle's charge, 
      (x_0, y_0) is the center of the track's projection onto the x-y plane, and ρ 
      is the radius of the same projection. Fig. 2-10 demonstrates combinations 
      of positive and negative D and C. 

φ_0   direction of track on x-y plane at minimum approach, i.e. the azimuthal 
      angle of the particle's p_T at minimum approach. 

Table 2.1: The 5 parameters of a helical track. 
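
The formal definition of the signed impact parameter in Table 2.1 can be checked 
with a short Python sketch. The function name and the example numbers are 
hypothetical; only the formula D = sign(q)(√(x_0² + y_0²) − ρ) is taken from the 
table. 

```python
import math


def signed_impact_parameter(q: float, x0: float, y0: float, rho: float) -> float:
    """D = sign(q) * (sqrt(x0^2 + y0^2) - rho).

    (x0, y0) is the center of the track's circular projection on the x-y
    plane and rho its radius, all in the same length unit. q is the charge.
    """
    distance_center_to_z_axis = math.hypot(x0, y0)
    return math.copysign(1.0, q) * (distance_center_to_z_axis - rho)


# The z axis lies outside the circle here (5.0 > 4.5), so the bracket is
# positive and the sign of D follows the sign of the charge.
d_positive_charge = signed_impact_parameter(+1, 3.0, 4.0, 4.5)
d_negative_charge = signed_impact_parameter(-1, 3.0, 4.0, 4.5)
```

When the z axis lies inside the circle (ρ larger than the distance of the center 
from the axis) the bracket changes sign, which gives the remaining two of the four 
sign combinations. 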



[Figure 2-10 panels, sign of the impact parameter D: 1. positively charged, D positive; 
2. negatively charged, D positive; 3. positively charged, D negative; 
4. negatively charged, D negative.] 

Figure 2-10: Combinations of positive and negative D and C (see Table 2.1) 
Figure 2-11: Schematic of the Histogram Tracking method. 



linear segments from the axial layers are linked to form a 2D track on the x-y plane, 
starting the extrapolation with the outermost segment as seed. The r-z projection of 
the track is attained by linking the segments from the stereo superlayers. Eventually, 
the track is characterized by the χ² of the fit, and is only kept if that figure of merit 
is below threshold. 

An alternative is the Histogram Tracking algorithm [33]. It starts with a coarse 
approximation of the final track, which is attained by extrapolating a segment of 
the track called "telescope", such as the outer superlayer segment. The extrapolated 
telescope corresponds to a helix whose parameters carry large uncertainty, therefore 
instead of a curve it can be imagined as a tube, to visualize those uncertainties 
(Fig. 2-11). In each layer the tube crosses there may be hits that fall inside the tube. 
For those hits, the likelihood of belonging to the track is calculated. Each crossed layer 
is translated into a histogram of those likelihoods. The histograms coming from 
different layers are then combined into a final one, and the track is reconstructed 
as the helix which maximizes the combined likelihood. Compared to the Segment 
Linking algorithm, this alternative is slower, but more efficient in cases of missing 
hits and more accurate in cases of spurious hits. 

The Histogram Tracking algorithm is also applied in Silicon tracking, where the 
part of the track in the COT is used as the telescope. 

In Silicon tracking [33], the information of the z of the primary vertex is used. 
That is known by combining hits from the stereo strips and extrapolating to the beam 
axis. This produces a variety of candidates, each of different likelihood, so in the end 
the primary vertex is taken to be at the most likely z. 

The Stand-Alone algorithm for Silicon tracking uses information exclusively from 
silicon hits, and therefore has the advantage of using the whole |η| < 2 acceptance of 
the Silicon Detector. It starts by finding hits in places where axial and stereo strips 
intersect. Then, triplets of aligned hits are identified. The information of the primary 
vertex is used to constrain the candidate helices. In the end the best fitting helix is 
kept. 

The Outside-In algorithm [34] takes COT tracks and extrapolates them into the 
Silicon Detector, adding hits via a progressive fit. As each layer of silicon is 
encountered, a road size is established based on the error matrix of the track. Hits that are 
within the road are added to the track, and the track parameters and error matrix are 
refit with this new information. A new track candidate is generated for each hit in 
the road, and each of these new candidates is then extrapolated to the next layer in, 
where the process is repeated. As the extrapolation proceeds, the track error matrix 
is inflated to reflect the amount of scattering material encountered. At the end of 
this process, there may be many track candidates associated with the original COT 
track. The candidate that has hits in the largest number of silicon layers is chosen as 
the winner; if more than one candidate has the same number of hits, the χ² of the fit 
in the silicon is used to decide. 
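
The final selection step of the Outside-In algorithm can be sketched in Python as 
follows. This is only an illustration of the stated rule (most silicon layers with 
hits, then best fit quality); the class and function names are hypothetical, and it 
assumes that a smaller χ² wins the tie-break, which the text implies but does not 
state explicitly. 

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SiliconCandidate:
    """Hypothetical summary of one track candidate after extrapolation."""
    label: str
    n_layers_with_hits: int  # number of silicon layers contributing a hit
    chi2: float              # chi-square of the fit in the silicon


def pick_winner(candidates: List[SiliconCandidate]) -> SiliconCandidate:
    """Prefer more silicon layers; break ties with the smaller chi-square."""
    if not candidates:
        raise ValueError("no candidates associated with this COT track")
    return min(candidates, key=lambda c: (-c.n_layers_with_hits, c.chi2))


candidates = [
    SiliconCandidate("a", n_layers_with_hits=4, chi2=3.1),
    SiliconCandidate("b", n_layers_with_hits=5, chi2=9.0),
    SiliconCandidate("c", n_layers_with_hits=5, chi2=6.5),
]
winner = pick_winner(candidates)  # "c": ties "b" on layers, has lower chi2
```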

The Inside-Out algorithm [35] performs the reverse extrapolation: from the Silicon 
Detector to the COT. Its goal is to use the Stand-Alone silicon track to associate it 
with COT hits and improve the efficiency of reconstruction of tracks that do not cross 
more than 4 COT superlayers. 
2.2.3 Calorimetry 

CDF is equipped with sampling electromagnetic and hadronic calorimeters in the 
central and plug region, enhanced with shower maximum and preshower detectors 
for improved particle identification [26]. Central calorimeters cover 2π rad in φ 
(Fig. 2-2). The central electromagnetic calorimeter covers |η| < 1.1 and the hadronic 
|η| < 1.3. The plug calorimeters reach as far as |η| = 3.6. They are segmented in 
wedge-shaped towers pointing to the center of CDF. Each tower covers about 0.1 
units of η and 15° in φ (Fig. 2-3). For increased acceptance, the hadronic calorimeter 
has the endwall calorimeter, spanning 30° < |90° − θ| < 45° (Fig. 2-6). 

Electromagnetic Calorimeter 

CEM and PEM comprise lead absorber sheets alternating with scintillator layers. 
Light produced in the scintillator is transferred by WLS fibers to two PMTs that 
correspond to each tower^. 

The CEM has a total maximum thickness of about 19 X_0, in 20-30 (varying with 
|η|) layers of 3 mm lead and 5 mm scintillator. Its energy resolution, after in situ 
calibration, is found to be 13.5%/√E_T ⊕ 2%. 

PEM contains 22 layers of lead, 4.5 mm each^, and its scintillator layers are 4 mm 
thick. Its total thickness is 21 X_0. Its resolution is 16%/√E_T ⊕ 1%. 
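
The symbol ⊕ in the resolutions above denotes addition in quadrature of a 
stochastic term, which falls as 1/√E_T, and a constant term. A minimal Python 
sketch of how such a formula is evaluated is given below; the function name is 
hypothetical and E_T is assumed to be expressed in GeV. 

```python
import math


def fractional_resolution(e_t_gev: float, stochastic: float, constant: float) -> float:
    """sigma/E_T = sqrt((stochastic / sqrt(E_T))^2 + constant^2).

    `stochastic` and `constant` are fractions, e.g. 0.135 and 0.02 for the
    CEM figures quoted in the text (13.5% and 2%).
    """
    if e_t_gev <= 0:
        raise ValueError("E_T must be positive")
    return math.hypot(stochastic / math.sqrt(e_t_gev), constant)


# CEM at E_T = 50 GeV: the two terms are comparable in size.
cem_50 = fractional_resolution(50.0, 0.135, 0.02)
# PEM at the same E_T, using 16% and 1%.
pem_50 = fractional_resolution(50.0, 0.16, 0.01)
```

At low E_T the stochastic term dominates, while at high E_T the resolution 
approaches the constant term. 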

In both CEM and PEM, there is a shower maximum detector, 6 X_0 into the 
calorimeter, where an electromagnetic shower statistically contains the biggest 
number of particles [3]. CES is a multi-wire proportional chamber with strip readout 
in the z direction and wires along φ. PES has scintillator strips that cross to form 
a 2-dimensional grid in each plug. With resolution of about 2 mm in the central 
and 1 mm in the plug, the showermax detectors facilitate the matching of tracks 
with calorimeter hits, improving identification. Also, sampling the profile of the 
electromagnetic showers at 6 X_0 allows for improved γ/π⁰ identification. 

^Having two PMTs per tower allows for cross-check of the validity of signals, using time 
information and comparing the difference in the signal intensity in the two. 

^The first layer is an exception, being 1 cm thick and read out separately to be used as a preshower 
detector. 
Finally, between the solenoid and the first layer of the CEM lies a set of multi-wire 
proportional chambers, the CPR, which samples the electromagnetic showers at 1.075 
X_0, viz. the solenoid's thickness. This information greatly enhances γ and soft 
electron identification [26]. 

Hadronic Calorimeter 

The hadronic calorimeter is similar to the electromagnetic, except that it uses iron for 
absorber instead of lead. The CHA is 4.7 λ_0 thick, consisting of 32 iron layers, 2.5 cm 
each, alternating with 1 cm scintillator layers. Its energy resolution is 75%/√E_T ⊕ 3%. 

The WHA has similar energy resolution [36], with a constant term of 4%. It contains 15 
layers of iron, 5 cm each, alternating with 1 cm layers of scintillator, adding up to 4.5 
λ_0. 

The PHA is thicker, containing 7 λ_0 in 23 layers of iron, 51 mm each, alternating 
with 6 mm layers of scintillator. Its energy resolution is 80%/√E_T ⊕ 5%. 

2.2.4 Muon System 

CDF is equipped with four muon detectors (Fig. 2-12), which will be described in 
this section. 

Muons weigh 200 times more than electrons, therefore radiate about 200² = 40,000 
times less by bremsstrahlung. They do not deposit much energy in the calorimeter, 
but rather traverse the whole detector almost unimpeded. This makes them easier to 
identify by installing wire chambers around the detector, beyond the calorimeter and 
even beyond extra absorbing material; muons are virtually the only ionizing particles 
that can reach there. 

Shielding the muon detectors behind absorber increases the detected muons' 
purity, but also enhances multiple scattering, which makes it harder to match the small 
track segment in the muon detector (called "stub") with the corresponding COT 
track. However this is not a very big problem, especially for high-p_T muons, since the 
displacement due to multiple scattering is about , where p_T is in GeV/c [26]. 
Figure 2-12: The muon detectors of CDF. 



Furthermore, some low-p_T muons cannot reach the muon detectors, but that is not 
a problem either, since the threshold is lower than 2.2 GeV/c [26], far lower than the 
p_T of the muons considered in this analysis. 

Central Muon detector (CMU) 

The CMU [26] surrounds the hadronic calorimeter, at radius 3.47 m, covering the 
|η| < 0.6 region. It consists of argon-ethane wire chamber cells operating in proportional 
mode, organized in stacks of four. Each wire chamber is 2.7 × 6.4 × 226 cm³ with a 
resistive stainless steel wire along its biggest dimension, which is aligned parallel to 
the z axis. In φ it is segmented in 24 wedges, each containing 4 stacks side by side, 
therefore each wedge contains a chamber of 4 × 4 = 16 cells (Fig. 2-13). 

The drift times (< 800 ns) are used to measure the r-φ projection of the track. 
The z coordinate of the track is extracted with about 10 cm precision, using the 
charge division method, whose principle is explained in Fig. 2-14. To apply this 
method, every pair of φ-adjacent cells have their wires ganged together at one 
end. 
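
The charge division estimate can be sketched in a few lines of Python. The sketch 
assumes the relation d = 2L·Q_1/(Q_1 + Q_2) that follows from the Ohm's-law argument 
of Fig. 2-14, with 2L taken to be the total length of the two ganged wires; the 
function name and the numerical inputs are hypothetical. 

```python
def charge_division_position(q1: float, q2: float, wire_length: float) -> float:
    """Position d along the ganged wire pair from the two collected charges.

    Assumes d = 2 * L * Q1 / (Q1 + Q2), where L is the length of one wire
    (so 2L is the full resistive path) and Q1, Q2 are the charges read out
    at the two free ends. d is returned in the units of wire_length.
    """
    total = q1 + q2
    if total <= 0:
        raise ValueError("total collected charge must be positive")
    return 2.0 * wire_length * q1 / total


# Equal charges place the hit at the ganged end, halfway along the 2L path.
d_equal = charge_division_position(1.0, 1.0, wire_length=226.0)
# A 1:3 split places the hit a quarter of the way along the 2L path.
d_quarter = charge_division_position(1.0, 3.0, wire_length=226.0)
```

Because only the ratio of the charges enters, a common gain factor in the two 
readout channels cancels, while a relative miscalibration shifts d directly. 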

Central Muon Upgrade detector (CMP) 

The CMP (Fig. 2-12) is shielded behind about 7.8 λ_0, comprising the calorimeter, 
the magnet return yoke and extra steel absorber. Compared to the CMU, which was 
Figure 2-13: Cross section of a CMU chamber. Each vertical array is one stack. 




Figure 2-14: The principle of charge division method. The ionization charge is 
collected at some position d along the z axis, and splits into two currents: I_1 and I_2. 
From Ohm's law, with wire resistance proportional to length and 2L the total length 
of the two ganged wires, I_1(2L − d) = I_2 d ⇒ d = 2L I_1/(I_1 + I_2). With the 
approximation that all currents last for the same amount of time Δt, we can write 
Q_i = I_i · const · Δt. Therefore, by measuring Q_1 and Q_2 one can determine 
d = 2L Q_1/(Q_1 + Q_2). 
shielded behind only 5 hadronic interaction lengths, the CMP provides higher purity 
in muon identification [26]. Those reconstructed muons that have a stub in both the 
CMU and the CMP are called "CMUP muons". 

The CMP is not azimuthally symmetric, but resembles a box surrounding the 
central region of the detector (|η| < 0.6). It is made of wire chambers similar to those 
used for the CMU, but larger: 2.5 × 15 × 640 cm³. 

A bigger difference is that CMP contains scintillator counters in addition to wire 
chambers. The scintillator layers lie on the outer side of the chambers and provide 
timing information that is used to discard out-of-time muon candidates, which could 
not possibly be muons originating from the center of the detector. Furthermore, 
timing helps avoid piling up stubs from different bunch crossings, given that the 
drift time in the CMP can be as large as 1.7 μs [26]. Finally, the dimensions of 
the scintillator counters are 2.5 × 30 × 320 cm³, so two scintillator counters are needed to 
cover the z dimension of the CMP, providing the very crude information of whether 
a muon stub has positive or negative z coordinate. 

CMX 

CMX [26] is very similar to CMP; it consists of the same type of wire chambers and 
scintillator counters. It differs significantly in geometry though. It covers the region 
0.6 < |η| < 1 and is shaped like a conic section on each side of the detector (Fig. 2-12). 
The wire chambers are grouped in wedges, each 15° in φ. Each wedge contains 48 chambers, 
arranged in 8 layers. The lower 90° of the CMX, which physically penetrate the floor 
supporting the detector, are called "miniskirt" for obvious reasons (Fig. 2-12). This 
part was not instrumented until after 2003. 

IMU 

IMU [26] covers the region 1 < |η| < 1.5 (Fig. 2-12). It comprises scintillator counters and 
wire chambers of dimensions 2.5 × 8.4 × 363 cm³. In combination with ISL tracking, 
it provides muon reconstruction and momentum measurement in the |η| > 1 region. 
2.2.5 Cerenkov Luminosity Counter 

CDF is equipped with the CLC [37], a detector dedicated to measuring instantaneous 
luminosity (ℒ). It consists of 2 × 48 Cerenkov counters placed in the far forward 
and backward region (3.75 < |η| < 4.75), filled with isobutane at nearly atmospheric 
pressure. 

The number of pp̄ interactions (n) in a bunch crossing follows the Poisson 
distribution with mean μ = σ_pp̄ ℒ t_BC, where σ_pp̄ is the cross section of inelastic pp̄ 
scattering and t_BC is the time interval between bunch crossings. 

Bunch crossings with n = 0 occur with probability P_0(μ) = e^(−μ). By measuring 
the fraction of empty crossings, μ can be measured^, and therefore ℒ. 

An alternative method consists in measuring μ directly as N/N_1, where N is the 
number of CLC counts of some bunch crossing, and N_1 is the average number of CLC 
counts in the case of single-interaction bunch crossings. N_1 can be measured at low 
ℒ, when μ ≪ 1. 

The first method, of measuring empty crossings, has the advantage of not needing 
any information such as N_1, but at high ℒ empty crossings become rare, making 
this method inefficient. On the other hand, the second method depends on the N_1 
information, and N/N_1 in reality does not scale linearly with ℒ, as the CLC occupancy 
grows and is eventually saturated due to the finite number of counters; therefore 
corrections for this non-linearity are required. 
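
The two luminosity estimates described above can be sketched in Python as follows. 
The function names are hypothetical, the inelastic cross section of 60 mb is an 
assumed round number used only for illustration (the text does not quote a value), 
and the acceptance correction follows the footnote by dividing the measured mean by 
the CLC acceptance. 

```python
import math

MILLIBARN_CM2 = 1e-27  # 1 mb expressed in cm^2


def mean_from_empty_fraction(empty_fraction: float, acceptance: float = 1.0) -> float:
    """Mean number of interactions from P(n = 0) = exp(-mu), acceptance corrected."""
    if not 0.0 < empty_fraction <= 1.0:
        raise ValueError("empty fraction must be in (0, 1]")
    return -math.log(empty_fraction) / acceptance


def mean_from_counts(n_counts: float, n_single: float) -> float:
    """Second method: mu = N / N_1, ignoring the saturation corrections."""
    return n_counts / n_single


def luminosity(mu: float, sigma_inel_cm2: float, t_bc_s: float) -> float:
    """Invert mu = sigma * L * t_BC to get L in cm^-2 s^-1."""
    return mu / (sigma_inel_cm2 * t_bc_s)


SIGMA_INEL = 60.0 * MILLIBARN_CM2  # assumption, not a value from the text
T_BC = 396e-9                      # bunch spacing quoted in the text, seconds

mu = mean_from_empty_fraction(math.exp(-4.752))  # recovers mu = 4.752
lumi = luminosity(mu, SIGMA_INEL, T_BC)          # close to 2e32 cm^-2 s^-1
```

With these assumed inputs, a mean of roughly five interactions per crossing 
corresponds to a luminosity of about 2 × 10³² cm⁻² s⁻¹. 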

The uncertainty in the integrated luminosity measured with the CLC is 6%, to 
which the biggest contribution comes from the uncertainty in σ_pp̄ at 1.96 TeV. 

2.2.6 Data Acquisition 

CDF employs approximately 10⁶ readout channels. A bunch crossing at ℒ ≈ 2 × 10³² 
s⁻¹cm⁻² yields on average about 5 pp̄ interactions. An event of such multiplicity takes 
about 200 kB of digitized information volume. It then becomes obvious that not every 
single bunch crossing can be read, as that would require the enormous bandwidth of 

^Of course it is necessary to correct the measured μ by dividing by the CLC acceptance ε. 
Figure 2-15: Diagram of the CDF DAQ system. 



~630 GB/s. 

Apart from being technically inevitable, it is also sensible to record only those events 
that pass some quality selection and would be of some interest^. For example, an 
event with leptons should be retained, while for multi-jet events it is enough to keep 
only a fraction of them, since they are so abundant in pp̄ collisions. 

The DAQ system [26] is responsible for selecting the best events as they occur. 
Fig. 2-15 provides an overview of the DAQ architecture. 



Level- 1 



The frequency of 2.5 MHz at which bunches cross is too high to allow for full re- 
construction of every event, so the first level of selection is based on fragments of 
information. This happens in Level-1; an accept/reject decision is made using 

^In an experiment of the broad scope of CDF it is not trivial to decide which events could be 
of some interest, since different analyses may see interest in different kinds of events. Furthermore, 
nobody is certain what the signature of physics beyond the Standard Model will be. 
"primitives", namely coarse information on COT tracks and stubs in the CMU, CMP and 
CMX [26]. Systems providing primitives are depicted in Fig. 2-16. The XFT crudely 
reconstructs COT tracks on the x-y plane. The XTRP extrapolates XFT tracks 
through the calorimeter and the muon system, finding matching hits/towers. 

Based on the primitives, several algorithms, also called "individual triggers", 
contribute to the Level-1 decision. For example, effort is made to keep events with 
high-p_T tracks, or leptons, or large missing transverse energy (E̸_T) etc. 

The latency of Level-1 is 5.5 μs, in which 14 bunch crossings occur. Therefore, 
all front-end electronics are equipped with buffers of enough capacity to contain in- 
formation from 14 bunch crossings. Level-1 then works as a synchronous pipeline; 
by the time 14 events are pushed back into the buffer, at least one event has been 
examined and pulled from it, freeing a slot for the current event to be buffered. 

Less than 2% of the events pass Level-1, making its accept rate less than 50 kHz. 

Level-2 

Level-2 functions as an asynchronous pipeline, where events are processed in FIFO 
mode [26]. With no more than 50 kHz input rate, it can afford up to 1/(50 kHz) = 20 
μs to decide on each event^. 

In its decision, Level-2 takes into account the primitives of Level-1, in addition to 
showermax information, as shown in Fig. 2-16. 

The acceptance rate of Level-2 is less than 1 kHz. Effort is made to maintain 
this rate as close to 1 kHz as possible, by readjusting the trigger requirements as ℒ 
changes, making them stricter at high ℒ and looser at low ℒ. 
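
The rates and time budgets quoted for the first two trigger levels follow from 
simple arithmetic, reproduced in the Python sketch below. All inputs are numbers 
stated in this section; the variable and function names are hypothetical. 

```python
import math

BUNCH_SPACING_S = 396e-9   # time between successive bunch crossings
CROSSING_RATE_HZ = 2.5e6   # bunch-crossing frequency quoted for Level-1
L1_LATENCY_S = 5.5e-6      # Level-1 latency
L1_PASS_FRACTION = 0.02    # upper bound on the fraction passing Level-1
L2_ACCEPT_RATE_HZ = 1e3    # upper bound on the Level-2 accept rate


def pipeline_depth(latency_s: float, spacing_s: float) -> int:
    """Number of crossings that must be buffered while Level-1 decides."""
    return math.ceil(latency_s / spacing_s)


depth = pipeline_depth(L1_LATENCY_S, BUNCH_SPACING_S)      # 14 crossings
l1_accept_rate_hz = L1_PASS_FRACTION * CROSSING_RATE_HZ    # 50 kHz
l2_time_budget_s = 1.0 / l1_accept_rate_hz                 # 20 microseconds
# Fraction of crossings already rejected when the full readout starts.
rejected_before_readout = 1.0 - L2_ACCEPT_RATE_HZ / CROSSING_RATE_HZ
```

The last quantity evaluates to 0.9996, which is the figure of 99.96% of events 
discarded before the whole detector is read out. 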

Event Builder 

In the case of a Level-2 accept, the whole detector is eventually read out. The EVB 
collects the fragments of the event and passes them to Level-3. Reading out the 
front-end electronics of the whole detector takes about 1 ms, which is why this step 

^Actually, since up to 4 events can be kept in the Level-2 buffer, the latency can be even greater, 
without causing dead-time, provided that this is not the case for too many events. 
Figure 2-16: Information flow within Level-1 and Level-2. XCES is a system that 
generates the stimulated showermax bitmap and finds matching tracks extrapolated 
by XTRP to define electron candidates. The SVT extrapolates XFT tracks into the 
SVX, providing the D and φ_0 information (Table 2.1). The TSI coordinates the flow 
of information and interfaces to the CDF clock, which is used to know when a bunch 
crossing is occurring. 
Figure 2-17: Diagram of the Event Builder. 



is only possible after having discarded over 99.96% of the events. 

EVB (Fig. 2-17) resides in 21 VME crates, each containing one Linux computer, 
referred to as SCPU [38]. Each crate is dedicated to reading a different part of the 
detector. Apart from the SCPU, each crate contains a series of memory buffers, the 
VRBs. When the front-end crates are read, the information of the event is first stored 
in the VRBs. Each SCPU reads the VRBs of its own crate through the VME backplane 
of the crate, which in combination with the Gigabit Ethernet networking allows for 
the desired system speed. On reading the VRBs, a byte-count check is performed, as 
well as checks of the size of each buffer entry [39]. Though in principle EVB should 
not be discarding any events, it does so if information is missing or corrupted. 

The function of the EVB is coordinated by the EVB Proxy, a process running 
on a dedicated Linux machine. All acknowledgement messages within the EVB are 
circulated through the EVB Proxy, as is any information exchanged with the 
TSI and Level-3. 
Level-3 

Level-3 is the last stage of trigger selection [38]. Receiving events from the EVB at 
< 1 kHz, it is purely software implemented, performing three basic functions: 

1. Concatenates same-event fragments coming from the EVB into an event entry. 

2. Imposes the final selection, taking into account the reconstructed objects infor- 
mation. 

3. Submits passing events to the CSL for storage. 

There is a whole cluster of 411 Linux computers, totaling 2.4 THz of CPU, 
dedicated to Level-3. Though all computers are nearly identical, they are separated into 
three categories, depending on their task: 

• 18 Converter nodes: They receive event fragments from the EVB and combine 
them to form self-contained event records which they pass to available Processor 
nodes. 

• 384 Processor nodes: Upon reception of events from a Converter, they apply 
the Level-3 filter to either discard or pass them to an Output node, after some 
reformatting that reshapes the passing entries to their final format. 

• 9 Output nodes: They receive the passing events from Processor nodes and 
propagate them to the CSL for storage. 

The Level-3 cluster is separated into 18 identical subsets, called "subfarms"^. This 
way, data handling proceeds in 18 independent, parallel streams which share the 
load of incoming events. Each subfarm contains 1 Converter, 21 or 22 Processors, 
and shares an Output with another subfarm. On every Processor, 5 Level-3 filters 
run simultaneously, on hyper-threaded dual-core Intel CPUs. The Converter of each 
subfarm is allowed to only submit events to Processors of its own subfarm, and the 
Processors of each subfarm can only send events to the Output node serving it. 
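
The node counts above are mutually consistent, as the following Python sketch 
checks. The helper names are hypothetical, and the pairing of consecutive subfarms 
onto a shared Output node is an assumption made for illustration; the text only 
states that each Output serves two subfarms. 

```python
from typing import List


def split_evenly(n_items: int, n_groups: int) -> List[int]:
    """Group sizes that differ by at most one and add up to n_items."""
    base, extra = divmod(n_items, n_groups)
    return [base + 1 if i < extra else base for i in range(n_groups)]


def output_node_for(subfarm_index: int) -> int:
    """Assumed pairing: subfarms (0, 1) share Output 0, (2, 3) Output 1, etc."""
    return subfarm_index // 2


N_SUBFARMS = 18
processors_per_subfarm = split_evenly(384, N_SUBFARMS)  # sizes of 21 or 22
outputs_used = {output_node_for(i) for i in range(N_SUBFARMS)}
total_nodes = N_SUBFARMS + sum(processors_per_subfarm) + len(outputs_used)
```

Splitting 384 Processors over 18 subfarms gives six subfarms with 22 Processors 
and twelve with 21, and the total of Converters, Processors and Outputs is 411. 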

^A term appropriate for a subdivision of the whole Level-3 cluster, which is called "farm" in CDF 
jargon. 
The operation of Level-3 is coordinated by the Level-3 Proxy application, running 
on a dedicated computer. The Proxy collects and sends acknowledgements from and 
to the computers of the cluster, and communicates with the EVB Proxy to indicate 
among other things which Converter is available to receive the next event. 

Filtering is done by a program written in C++, the Level-3 filter executable, 
which applies criteria stored in a centralized database implemented in Oracle. The 
database stores the trigger table, which is a list of "triggers". Each trigger is 
structured to contain the following information: 

1. The prerequisite Level-1 and Level-2 triggers. 

2. The C++ reconstruction modules that should be used and in what order. 

3. The specific selection criteria decided having some physics goal, for example a 
cut in some invariant mass in the event. 

4. The name of the dataset in which to store the event if it passes the trigger 
selection. 
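
The four pieces of information held by each trigger can be pictured as a small 
record, as in the Python sketch below. This is not the actual CDF schema or filter 
code (the real filter is a C++ executable reading an Oracle database); every class, 
field and trigger name here is hypothetical, and requiring all prerequisite 
triggers to have fired is an assumption. 

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

Event = Dict[str, float]  # hypothetical: reconstructed quantities by name


@dataclass
class TriggerEntry:
    name: str
    prerequisites: List[str]            # 1. required Level-1/Level-2 triggers
    reco_modules: List[str]             # 2. reconstruction modules, in order
    selection: Callable[[Event], bool]  # 3. physics-motivated selection
    dataset: str                        # 4. dataset for passing events


def datasets_for(event: Event, fired: Set[str], table: List[TriggerEntry]) -> List[str]:
    """Datasets the event is written to, given the lower-level triggers fired."""
    return [
        t.dataset
        for t in table
        if all(p in fired for p in t.prerequisites) and t.selection(event)
    ]


table = [
    TriggerEntry(
        name="HIGH_MASS_DIMUON",
        prerequisites=["L1_MUON", "L2_MUON"],
        reco_modules=["tracking", "muon_reco"],
        selection=lambda ev: ev["dimuon_mass"] > 100.0,  # invariant-mass cut
        dataset="dimuon_high_mass",
    )
]
stored = datasets_for({"dimuon_mass": 150.0}, {"L1_MUON", "L2_MUON"}, table)
```

An event may satisfy several triggers and therefore belong to more than one 
dataset, which is why the helper returns a list. 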

The output rate of Level-3 is about 100 Hz. The events passing Level-3 are sent 
to the CSL for immediate storage. From there, they are shortly afterwards sent to the 
FCC for permanent storage on magnetic tape. 

2.2.7 Off-line production 

Data analysis is not performed on the raw data. Before the data on tape are usable, 
the off-line production process has to take place. 

At production [26], the raw data banks are unpacked and physics objects are 
reconstructed in full detail. This is similar to what is done at Level-3, but the off-line 
reconstruction is much more elaborate, applying the latest calibrations, since those 
reconstructed objects will be the final ones to be used for analysis. 

Since passing Level-3, each event contains the information of the dataset(s) it 
belongs to. At the production, even further partitioning is made; datasets are collections 
of filesets, which are collections of files containing events. 
For the needs of each analysis, the raw data are taken from the appropriate dataset 
and are converted to a convenient format. Since ROOT [40] is the adopted analysis 
framework, the format varies between different architectures of ROOT Trees. For 
example, one is the "topNtuple" , used mostly by collaborators doing t-quark analy- 
ses, but a more common format, used also in the present analysis, is the "Standard 
Ntuple" (Stntuple). 
Chapter 3 
Data Analysis 



The analysis going into this thesis was conducted in two rounds: first with 1 fb⁻¹ of 
data, and then with 2 fb⁻¹. The first round has been documented in [41, 42, 43, 44]. 
An updated publication is currently being prepared for the second one. This chapter is 
an adaptation of [41], while chapter 4 presents material that will be in the publication 
of the second round. 

3.1 Strategy 

Sec. 1.3 motivates the goal of this analysis, viz. the model-independent search for new 
physics. The method is to obtain a satisfactory description of the Standard Model 
expectation in channels where high-p_T data are observed, and employ an array of 
probes to search for statistically significant discrepancies between data and Standard 
Model background. 

Crucial for model-independence is to not focus on channels sensitive to particular
models, but to examine data in as many channels as possible. That introduces to this
analysis over two million events (in 1 fb$^{-1}$), ranging from abundant QCD to rare
electroweak ones. Studying this large volume of qualitatively diverse data requires
reducing the information content of each event to bare bones and characterizing each
event in terms of physics objects that maintain the same meaning universally in any
kind of event. In each event, the 4-momenta of any reconstructed physics objects in
its final state are recorded. These objects can be leptons, photons, hadronic jets or
missing energy.

Another ingredient of model-independence is to not segregate the data into "con- 
trol" and "signal" regions a priori, namely into regions where new physics is assumed 
to not exist or to exist respectively. In most analyses control regions are predefined, 
to adjust correction factors, under the assumption that there is no new physics in 
those regions and that the extrapolation of correction factors from the control to the 
signal region is valid. However, what is considered control region in one analysis is 
often signal region in some other, so, to be as generic as possible, one needs to treat 
all data as signal and control regions simultaneously, to address the question "how 
well does the Standard Model implementation describe the data?" If there is indeed 
detectable new physics, then it will be impossible to achieve good agreement between 
data and Standard Model simultaneously in all regions. More in Sec. C. 

The Standard Model prediction is implemented in three steps: 

1. Monte Carlo generation and matching [45] of samples simulating the Standard 
Model processes. 

2. CDF detector simulation, which models the detector response to the MC generated
events. For that, the GEANT-based package CDFsim is used.

3. Fine-tuning of the outcome of CDFsim to account for theoretical and experimental
correction factors.

Structurally, the analysis contains four parts: 

1. The Vista global fit, which adjusts and applies the correction model, providing 
the Standard Model background of the best possible global agreement with the 
data, exploiting the flexibility granted by the correction model. 

2. The Vista comparison, which examines the statistical significance of features
in the bulk of all distributions and sorts the information in a comprehensive
way.

3. The Sleuth search, which focuses on the high-$p_T$ tails searching for excesses
of data.

4. The Bump Hunter search (present only for the second round of the analysis), 
which scans all mass variables for local excesses of data, potentially indicating 
a new resonance. 

The above statistical probes are employed simultaneously, rather than sequentially. 
So, an effect highlighted by Sleuth prompts additional investigation of the discrep- 
ancy, usually resulting in a specific hypothesis explaining the discrepancy in terms 
of a detector effect or adjustment to the Standard Model prediction that is then fed 
back and tested for global consistency. 

Statistical significance is a necessary but insufficient condition for discovery. A 
statistically significant discrepancy could be attributed to inaccuracy in the Standard 
Model implementation, or in modeling the detector response. These possibilities 
would need to be considered on a case-by-case basis. In the event of a significant 
discrepancy, the breadth of view of this analysis can be exploited to evaluate the 
plausibility of it being a detector effect or a problem in the Standard Model imple- 
mentation. 

Forming hypotheses for the cause of specific discrepancies, implementing those 
hypotheses to assess their wider consequences, and testing global agreement after the 
implementation are emphasized as the crucial activities for the investigator through- 
out the process of data analysis. This process is constrained by the requirement that 
all adjustments be physically motivated. The investigation and resolution of dis- 
crepancies highlighted by the algorithms is the defining characteristic of this global 
analysis [*].

[*] It is not possible to systematically simulate the process of constructing,
implementing, and testing hypotheses motivated by particular discrepancies, since this
process is carried out by individuals. The statistical interpretation of this analysis
is made bearing this process in mind.

This search for new physics terminates when either a compelling case for new
physics is made, or there remain no statistically significant discrepancies on which a
new physics case can be made. In the former case, to quantitatively assess the
significance of the potential discovery, a full treatment of systematic uncertainties must
be implemented. In the latter case, it is sufficient to demonstrate that all observed
effects are not in significant disagreement with an appropriate global Standard Model
description.

3.2 Vista 

This section describes Vista: object identification, event selection, estimation of 
Standard Model backgrounds, simulation of the CDF detector response, development 
of a correction model, and results. 

3.2.1 Object identification 

Energetic and isolated electrons, muons, taus, photons, jets, and b-tagged jets with
$|\eta_{det}| < 2.5$ and $p_T > 17$ GeV are identified according to CDF standard criteria. The
same criteria are used for all events. The isolation criteria employed vary according
to object, but roughly require less than 2 GeV of extra energy flow within a cone
in $\eta$-$\phi$ space around each object.

Standard CDF criteria [46] are used to identify electrons ($e^\pm$) in the central and
plug regions of the CDF detector. Electrons are characterized by a narrow shower in 
the central or plug electromagnetic calorimeter and a matching isolated track in the 
central gas tracking chamber or a matching plug track in the silicon detector. 

Standard CDF muons ($\mu^\pm$) are identified using three separate subdetectors in
the regions $|\eta_{det}| < 0.6$, $0.6 < |\eta_{det}| < 1.0$, and $1.0 < |\eta_{det}| < 1.5$ [46]. Muons are
characterized by a track in the central tracking chamber matched to a track segment in 
the central muon detectors, with energy consistent with minimum ionizing deposition 
in the electromagnetic and hadronic calorimeters along the muon trajectory. 

Narrow central jets with a single charged track are identified as tau leptons ($\tau^\pm$)
that have decayed hadronically [47]. Taus are distinguished from electrons by requiring
a substantial fraction of their energy to be deposited in the hadron calorimeter;
taus are distinguished from muons by requiring no track segment in the muon detector
coinciding with the extrapolated track of the tau. Track and calorimeter isolation
requirements are imposed.

Standard CDF criteria requiring the presence of a narrow electromagnetic cluster 
with no associated tracks are used to identify photons ($\gamma$) in the central and plug
regions of the CDF detector [48]. 

Jets (j) are reconstructed using the JetClu [49] clustering algorithm with a cone
of size $\Delta R = 0.4$, unless the event contains one or more jets with $p_T > 200$ GeV and
no leptons or photons, in which case cones of $\Delta R = 0.7$ are used. Jet energies are
appropriately corrected to the parton level [50]. Since uncertainties in the Standard
Model prediction grow with increasing jet multiplicity, up to the four largest $p_T$ jets
are used to characterize the event; any reconstructed jets with $p_T$-ordered ranking of
five or greater are neglected and their energy is treated as unclustered, except in final
states with small summed scalar transverse momentum containing only jets.

A secondary vertex b-tagging algorithm is used to identify jets likely resulting
from the fragmentation of a bottom quark (b) produced in the hard scattering [51].

Momentum visible in the detector but not clustered into an electron, muon, tau,
photon, jet, or b-tagged jet is referred to as unclustered momentum (uncl).

Missing momentum ($\not\!p$) is calculated as the negative vector sum of the 4-vectors
of all identified objects and unclustered momentum. An event is said to contain a $\not\!p$
object if the transverse momentum of this object exceeds 17 GeV, and if additional
quality criteria discriminating against fake missing momentum due to jet
mismeasurement are satisfied [*].

[*] An additional quality criterion is applied to the significance of the missing
transverse momentum $\vec{\not\!p}_T$ in an event, requiring that the energies of hadronic
objects can not be adjusted within resolution to reduce the missing transverse momentum
to less than 10 GeV. The transverse components of all hadronic energy clusters
$\vec{p}_{Ti}$ in the event are projected onto the unit missing transverse momentum vector
$\hat{\not\!p}_T = \vec{\not\!p}_T / |\vec{\not\!p}_T|$, and a "conservative" missing transverse momentum
$\not\!p_T' = |\vec{\not\!p}_T| - 2.5 \sqrt{\sum_i |\vec{p}_{Ti} \cdot \hat{\not\!p}_T|}$ is defined, where the sum is over
hadronic energy clusters in the event, and the hadronic energy resolution of the CDF
detector has been approximated as $100\% \sqrt{p_{Ti}}$, expressed in GeV. An event is said
to contain missing transverse momentum if $\not\!p_T > 17$ GeV and $\not\!p_T' > 10$ GeV.
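The quality criterion described in the footnote can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the CDF
implementation: the function names, the representation of hadronic clusters as
(px, py) pairs, and the example numbers in the usage note are all hypothetical.

```python
import math


def conservative_missing_pt(missing_px, missing_py, hadronic_clusters):
    """Return (missing_pt, conservative_missing_pt), both in GeV.

    hadronic_clusters is an iterable of (px, py) transverse components in GeV
    (an assumed representation chosen for this sketch).
    """
    missing_pt = math.hypot(missing_px, missing_py)
    if missing_pt == 0.0:
        return 0.0, 0.0
    # Unit vector along the missing transverse momentum.
    ux, uy = missing_px / missing_pt, missing_py / missing_pt
    # Sum of |projection| of each hadronic cluster onto that unit vector.
    projected = sum(abs(px * ux + py * uy) for px, py in hadronic_clusters)
    # With a resolution of 100% * sqrt(pT), the summed variance is `projected`,
    # so 2.5 standard deviations are subtracted.
    return missing_pt, missing_pt - 2.5 * math.sqrt(projected)


def has_missing_pt_object(missing_px, missing_py, hadronic_clusters):
    """Apply the two thresholds quoted in the text: 17 GeV and 10 GeV."""
    pt, pt_conservative = conservative_missing_pt(
        missing_px, missing_py, hadronic_clusters
    )
    return pt > 17.0 and pt_conservative > 10.0
```

For example, a 30 GeV imbalance accompanied by a single 16 GeV hadronic cluster along
the same axis gives a conservative value of 30 - 2.5 x 4 = 20 GeV and passes, while
the same imbalance next to a 64 GeV cluster gives exactly 10 GeV and fails.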
3.2.2 Event selection

Events containing an energetic and isolated electron, muon, tau, photon, or jet are
selected. A set of three-level online triggers requires:

• a central electron candidate with $p_T > 18$ GeV passing level 3, with an associated
track having $p_T > 8$ GeV and an electromagnetic energy cluster with
$p_T > 16$ GeV at levels 1 and 2; or

• a central muon candidate with $p_T > 18$ GeV passing level 3, with an associated
track having $p_T > 15$ GeV and muon chamber track segments at levels 1 and 2;
or

• a central or plug photon candidate with $p_T > 25$ GeV passing level 3, with a ratio of
hadronic to electromagnetic energy less than 1:8 and with a ratio of energy surrounding
the photon to the photon's energy less than 1:7 at levels 1 and 2; or

• a central or plug jet with $p_T > 20$ GeV passing level 3, with 15 GeV of transverse
momentum required at levels 1 and 2, with corresponding prescales of 50 and
25, respectively; or

• a central or plug jet with $p_T > 100$ GeV passing level 3, with energy clusters of
20 GeV and 90 GeV required at levels 1 and 2; or

• a central electron candidate with $p_T > 4$ GeV and a central muon candidate
with $p_T > 4$ GeV passing level 3, with a muon segment, electromagnetic cluster,
and two tracks with $p_T > 4$ GeV required at levels 1 and 2; or

• a central electron or muon candidate with $p_T > 4$ GeV and a plug electron
candidate with $p_T > 8$ GeV, requiring a central muon segment and track or
central electromagnetic energy cluster and track at levels 1 and 2, together
with an isolated plug electromagnetic energy cluster; or

• two central or plug electromagnetic clusters with $p_T > 18$ GeV passing level 3,
with a ratio of hadronic to electromagnetic energy less than 1:8 at levels 1 and 2; or

• two central tau candidates with $p_T > 10$ GeV passing level 3, each with an
associated track having $p_T > 10$ GeV and a calorimeter cluster with $p_T > 5$ GeV
at levels 1 and 2.

Events satisfying one or more of these online triggers are recorded for further 
study. Offline event selection for this analysis uses a variety of further filters. Single 
object requirements keep events containing: 

• a central electron with $p_T > 25$ GeV, or

• a plug electron with $p_T > 40$ GeV, or

• a central muon with $p_T > 25$ GeV, or

• a central photon with $p_T > 60$ GeV, or

• a central jet or b-tagged jet with $p_T > 200$ GeV, or

• a central jet or b-tagged jet with $p_T > 40$ GeV (prescaled by a factor of roughly
10^), 

possibly with other objects present. Multiple object criteria select events containing: 

• two electromagnetic objects (electron or photon) with $|\eta| < 2.5$ and $p_T >
25$ GeV, or

• two taus with $|\eta| < 1.0$ and $p_T > 17$ GeV, or

• a central electron or muon with $p_T > 17$ GeV and a central or plug electron,
central muon, or central tau with $p_T > 17$ GeV, or

• a central photon with $p_T > 40$ GeV and a central electron or muon with $p_T >
17$ GeV, or

• a central or plug photon with $p_T > 40$ GeV and a central tau with $p_T > 40$ GeV,
or

• a central photon with $p_T > 40$ GeV and a central b-jet with $p_T > 25$ GeV, or

• a central jet or b-tagged jet with $p_T > 40$ GeV and a central tau with $p_T >$
17 GeV (prescaled by a factor of roughly 10^), or 

• a central or plug photon with $p_T > 40$ GeV and two central taus with $p_T >
17$ GeV, or

• a central or plug photon with $p_T > 40$ GeV and two central b-tagged jets with
$p_T > 25$ GeV, or

• a central or plug photon with $p_T > 40$ GeV, a central tau with $p_T > 25$ GeV,
and a central b-tagged jet with $p_T > 25$ GeV,

possibly with other objects present. Explicit online triggers feeding this offline
selection are required. The $p_T$ thresholds for these criteria are chosen to be sufficiently
above the online trigger turn-on curves that trigger efficiencies can be treated as
roughly independent of object $p_T$.
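As an illustration of how the single-object offline requirements listed above might
be encoded, the following Python sketch keeps an event if any object exceeds its
threshold. The data layout (tuples of object type, detector region, and $p_T$), the
names, and the handling of the prescaled 40 GeV jet path as a simple on/off flag are
assumptions made for this example; the prescale factor itself is not modeled.

```python
# Offline single-object pT thresholds in GeV, keyed by (object type, region),
# transcribed from the list in the text. The key layout is hypothetical.
SINGLE_OBJECT_THRESHOLDS = {
    ("e", "central"): 25.0,
    ("e", "plug"): 40.0,
    ("mu", "central"): 25.0,
    ("gamma", "central"): 60.0,
    ("j", "central"): 200.0,
    ("b", "central"): 200.0,
}

# Threshold of the prescaled central jet / b-tagged jet path, in GeV.
PRESCALED_JET_THRESHOLD = 40.0


def passes_single_object(objects, accept_prescaled=False):
    """Return True if any object satisfies a single-object requirement.

    objects is an iterable of (kind, region, pt) tuples. accept_prescaled
    stands in for an event that was kept by the prescaled jet path.
    """
    for kind, region, pt in objects:
        threshold = SINGLE_OBJECT_THRESHOLDS.get((kind, region))
        if threshold is not None and pt > threshold:
            return True
        if (
            accept_prescaled
            and kind in ("j", "b")
            and region == "central"
            and pt > PRESCALED_JET_THRESHOLD
        ):
            return True
    return False
```

Events failing this check can still be kept by the multiple-object criteria, which
would need a similar but combinatorial test.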

Good run criteria are imposed, requiring the operation of all major subdetectors. 
To reduce contributions from cosmic rays and events from beam halo, standard CDF 
cosmic ray and beam halo filters are applied [52] . 

These selections result in a sample of roughly two million high-$p_T$ data events in
an integrated luminosity of 927 pb$^{-1}$.

3.2.3 Event generation 

Standard Model backgrounds are estimated by generating a large sample of Monte 
Carlo events, using the Pythia [53], MadEvent [54], and Herwig [55] generators. 
MadEvent performs a leading order matrix element calculation, and provides 4- 
vector information corresponding to the outgoing legs of the underlying Feynman 
diagrams, together with color flow information. Pythia 6.218 is used to handle 
showering and fragmentation. The CTEQ5L [56] parton distribution functions are
used.

QCD jets. QCD dijet and multijet production are estimated using Pythia. Samples
are generated with Tune A [57] with lower cuts on $\hat{p}_T$, the transverse momentum
of the scattered partons in the center of momentum frame of the incoming partons, of
0, 10, 18, 40, 60, 90, 120, 150, 200, 300, and 400 GeV. These samples are combined to
provide a complete estimation of QCD jet production, using the sample with greatest
statistics in each range of $\hat{p}_T$.
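One way to combine such overlapping samples without double counting is to let each
sample "own" the range between its lower cut and the next one. The Python sketch
below illustrates that idea. It assumes that the sample with the highest lower cut not
exceeding an event's $\hat{p}_T$ is the one with the greatest statistics there; the function
names are hypothetical and this is not the code used in the analysis.

```python
import bisect

# Lower cuts on p-hat_T (GeV) of the generated QCD samples, as listed in the text.
PT_HAT_CUTS = [0, 10, 18, 40, 60, 90, 120, 150, 200, 300, 400]


def owning_sample_cut(pt_hat):
    """Return the lower cut of the sample assumed to cover this p-hat_T."""
    if pt_hat < 0:
        raise ValueError("p-hat_T must be non-negative")
    # Index of the last cut that does not exceed pt_hat.
    index = bisect.bisect_right(PT_HAT_CUTS, pt_hat) - 1
    return PT_HAT_CUTS[index]


def keep_event(sample_cut, pt_hat):
    """Keep an event only if it comes from the sample that owns its range."""
    return owning_sample_cut(pt_hat) == sample_cut
```

With this rule an event with $\hat{p}_T = 25$ GeV is taken from the sample generated with
a lower cut of 18 GeV and discarded from the samples with cuts of 0 and 10 GeV.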

$\gamma$+jets. The estimation of QCD single prompt photon production comes from
Pythia. Five samples are generated with Tune A corresponding to lower cuts on
$\hat{p}_T$ of 8, 12, 22, 45, and 80 GeV. These samples are combined to provide a complete
estimation of single prompt photon production in association with one or more jets,
placing cuts on $\hat{p}_T$ to avoid double counting.

$\gamma\gamma$+jets. QCD diphoton production is estimated using Pythia.

V+jets. The estimation of V+jets processes (with V denoting W or Z), where the
W or Z decays to first or second generation leptons, comes from MadEvent, with
Pythia employed for showering. Tune AW [57] is used within Pythia for these
samples. The CKKW matching prescription [45] with a matching scale of 15 GeV
is used to combine these samples and avoid double counting. Additional statistics
are generated on the high-$p_T$ tails using the MLM matching prescription [58]. The
factorization scale is set to the vector boson mass; the renormalization scale for each
vertex is set to the $p_T$ of the jet. W+4 jets are generated inclusively in the number
of jets; Z+3 jets are generated inclusively in the number of jets.

VV+jets. The estimation of WW, WZ, and ZZ production with zero or more jets
comes from Pythia.

V$\gamma$+jets. The estimation of $W\gamma$ and $Z\gamma$ production comes from MadEvent, with
showering provided by Pythia. These samples are inclusive in the number of jets.

$W(\to \tau\nu)$+jets. Estimation of $W \to \tau\nu$ with zero or more jets comes from Pythia.

$Z(\to \tau\tau)$+jets. Estimation of $Z \to \tau\tau$ with zero or more jets comes from Pythia.

$t\bar{t}$. Top quark pair production is estimated using Herwig assuming a top quark
mass of 175 GeV and NNLO cross section of 6.77 ± 0.42 pb [59].

Remaining processes, including for example $Z(\to \nu\bar{\nu})\gamma$ and $Z(\to \ell^+\ell^-)b\bar{b}$, are
generated by systematically looping over possible final state partons, using MadGraph
[60] to determine all relevant diagrams, and using MadEvent to perform a
Monte Carlo integration over the final state phase space and to generate events. The
MLM matching prescription is employed to combine samples with different numbers
of final state jets.

A higher statistics estimate of the high-$p_T$ tails is obtained by computing the
thresholds in $\sum p_T$ corresponding to the top 10% and 1% of each process, where
$\sum p_T$ denotes the scalar summed transverse momentum of all identified objects in an
event. Roughly ten times as many events are generated for the top 10%, and roughly
one hundred times as many events are generated for the top 1%.

Cosmic rays. Backgrounds from cosmic ray or beam halo muons that interact 
with the hadronic or electromagnetic calorimeters, producing objects that look like a 
photon or jet, are estimated using a sample of data events containing fewer than three 
reconstructed tracks. This procedure is described in more detail in Appendix A.2.1.

Minimum bias. Minimum bias events are overlaid according to run-dependent
instantaneous luminosity in some of the Monte Carlo samples, including those used for
inclusive W and Z production. In all samples not containing overlaid minimum bias
events, including those used to estimate QCD dijet production, additional unclustered
momentum is added to events to mimic the effect of the majority of multiple
interactions, in which a soft dijet event accompanies the rare hard scattering of interest.
A random number is drawn from a Gaussian centered at 0 with width 1.5 GeV for
each of the x and y components of the added unclustered momentum. Backgrounds
due to two rare hard scatterings occurring in the same bunch crossing are estimated
by forming overlaps of events, as described in Appendix A.2.2.

Each generated Standard Model event is assigned a weight, calculated as the
cross section for the process (in units of picobarns) divided by the number of events
generated for that process, representing the number of such events expected in a data
sample corresponding to an integrated luminosity of 1 pb$^{-1}$. When multiplied by the
integrated luminosity of the data sample used in this analysis, the weight gives the
predicted number of such events in this analysis.
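The weighting just described amounts to two multiplications, sketched below in
Python. The function names and the numbers in the usage note (a 100 pb process with
50,000 generated events) are hypothetical and serve only to make the arithmetic
concrete; the 927 pb$^{-1}$ luminosity is the value quoted for this data sample.

```python
def event_weight(cross_section_pb, n_generated):
    """Weight of one generated event: expected events per 1 pb^-1 of data."""
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    return cross_section_pb / n_generated


def predicted_events(cross_section_pb, n_generated, n_selected, luminosity_pb_inv):
    """Predicted yield from n_selected generated events passing the selection."""
    weight = event_weight(cross_section_pb, n_generated)
    return weight * n_selected * luminosity_pb_inv
```

In this hypothetical case each generated event carries a weight of 0.002 per pb$^{-1}$, so
that in 927 pb$^{-1}$ a single selected Monte Carlo event stands for about 1.85 predicted
events, and 1,000 selected events for about 1,854.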

3.2.4 Detector simulation 

The response of the CDF detector is simulated using a GEANT-based detector simu- 
lation (CDFsim) [61], with GFLASH [62] used to simulate shower development in the 
calorimeter. 

In $p\bar{p}$ collisions there is an ordering of frequency with which objects of different
types are produced: many more jets (j) are produced than b-jets (b) or photons ($\gamma$),
and many more of these are produced than charged leptons ($e$, $\mu$, $\tau$). The CDF
detectors and reconstruction algorithms have been designed so that the probability of
misreconstructing a frequently produced object as an infrequently produced object is
small. The fraction of central jets that CDFsim misreconstructs as photons, electrons,
and muons is ~ 10"^, ~ 10"^, and ~ 10~^, respectively. Due to these small numbers, 
the use of CDFsim to model these fake processes would require generating samples
with prohibitively large statistics. Instead, the modeling of a frequently produced
object faking a less frequently produced object (specifically: j faking b, $\gamma$, $e$, $\mu$, or
$\tau$; or b or $\gamma$ faking $e$, $\mu$, or $\tau$) is obtained by the application of a misidentification
probability, a particular type of correction factor in the Vista correction model,
described in the next section.
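The idea of applying a misidentification probability as a weight, rather than waiting
for the simulation to produce a fake, can be sketched as follows. This is a simplified
illustration and not the Vista implementation: the function name and the list-of-labels
event representation are hypothetical, kinematic dependence of the fake rate is ignored,
and the small reduction of the original final state by the factor (1 - p) is neglected.

```python
def fake_contributions(event_weight, objects, fake_probability, source="j", target="e"):
    """List the ways a single `source` object can be relabeled as `target`.

    Returns a list of (final_state_objects, weight) pairs, one per source
    object in the event, each weighted by the misidentification probability.
    """
    contributions = []
    for index, kind in enumerate(objects):
        if kind == source:
            relabeled = objects[:index] + [target] + objects[index + 1:]
            contributions.append((sorted(relabeled), event_weight * fake_probability))
    return contributions
```

A simulated dijet event of weight 2.0 with a per-jet probability of $9.71 \times 10^{-5}$ then
contributes twice to the electron-plus-jet final state, once for each jet, with weight
$1.942 \times 10^{-4}$ each.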

In Monte Carlo samples passed through CDFsim, reconstructed leptons and photons
are required to match to a corresponding generator level object. This procedure
removes reconstructed leptons or photons that arise from a misreconstructed quark
or gluon jet.

3.2.5 Correction model



Unfortunately, some numbers that can not be determined from first principles enter
the comparison between data and the Standard Model prediction. These numbers
are referred to as "correction factors", and together they form the correction model.
This correction model is applied to generated Monte Carlo events to obtain the
Standard Model prediction across all final states.

Correction factors must be obtained from the data themselves. These factors may 
be thought of as Bayesian nuisance parameters. The actual values of the correction 
factors are not directly of interest. Of interest is the comparison of data to Standard 
Model prediction, with correction factors adjusted to whatever they need to be, con- 
sistent with external constraints, to bring the Standard Model into closest agreement 
with the data. 

The traditional prescription for determining these correction factors is to "measure"
them in a "control region" in which no signal is expected. This procedure
encounters difficulty when the entire high-$p_T$ data sample is considered to be a signal
region. The approach adopted instead is to ask whether a consistent set of correction
factors can be chosen so that the Standard Model prediction is in agreement with the
CDF high-$p_T$ data.

The correction model is obtained by an iterative procedure informed by observed 
inadequacies in modeling. The process of correction model improvement, motivated 
by observed discrepancies, may allow a real signal to be artificially suppressed. If 
adjusting correction factor values within allowed bounds removes a signal, then the 
case for the signal disappears, since it can be explained in terms of known physics. 
This is true in any analysis. The stronger the constraints on the correction model, 
the more difficult it is to artificially suppress a real signal. By requiring a consistent
interpretation of hundreds of final states, Vista is less likely to mistakenly explain
away new physics than analyses of more limited scope.

The 44 correction factors currently included in the correction model are shown
in Table 3.1. These factors can be classified into two categories: theoretical and
experimental. A more detailed description of each individual correction factor is
provided in Appendix A.4.

Code  Category    Explanation                                                   Value                 Error               Error (%)
0001  luminosity  CDF integrated luminosity                                     927                   20                  2.2
0002  k-factor    cosmic $\gamma$                                               0.69                  0.05                7.3
0003  k-factor    cosmic j                                                      0.446                 0.014               3.1
0004  k-factor    1$\gamma$1j photon+jet(s)                                     0.95                  0.04                4.2
0005  k-factor    1$\gamma$2j                                                   1.2                   0.05                4.1
0006  k-factor    1$\gamma$3j                                                   1.48                  0.07                4.7
0007  k-factor    1$\gamma$4j+                                                  1.97                  0.16                8.1
0008  k-factor    2$\gamma$0j diphoton(+jets)                                   1.81                  0.08                4.4
0009  k-factor    2$\gamma$1j                                                   3.42                  0.24                7.0
0010  k-factor    2$\gamma$2j+                                                  1.3                   0.16                12.3
0011  k-factor    W0j W(+jets)                                                  1.453                 0.027               1.9
0012  k-factor    W1j                                                           1.06                  0.03                2.8
0013  k-factor    W2j                                                           1.02                  0.03                2.9
0014  k-factor    W3j+                                                          0.76                  0.05                6.6
0015  k-factor    Z0j Z(+jets)                                                  1.419                 0.024               1.7
0016  k-factor    Z1j                                                           1.18                  0.04                3.4
0017  k-factor    Z2j+                                                          1.03                  0.05                4.8
0018  k-factor    2j $p_T < 150$                                                0.96                  0.022               2.3
0019  k-factor    2j $150 < p_T$                                                1.256                 0.028               2.2
0020  k-factor    3j $p_T < 150$                                                0.921                 0.021               2.3
0021  k-factor    3j $150 < p_T$                                                1.36                  0.03                2.4
0022  k-factor    4j $p_T < 150$                                                0.989                 0.025               2.5
0023  k-factor    4j $150 < p_T$                                                1.7                   0.04                2.3
0024  k-factor    5j+                                                           1.25                  0.05                4.0
0025  ID eff      $p(e \to e)$ central                                          0.986                 0.006               0.6
0026  ID eff      $p(e \to e)$ plug                                             0.933                 0.009               1.0
0027  ID eff      $p(\mu \to \mu)$ $|\eta| < 0.6$                               0.845                 0.008               0.9
0028  ID eff      $p(\mu \to \mu)$ $0.6 < |\eta|$                               0.915                 0.011               1.2
0029  ID eff      $p(\gamma \to \gamma)$ central                                0.974                 0.018               1.8
0030  ID eff      $p(\gamma \to \gamma)$ plug                                   0.913                 0.018               2.0
0031  ID eff      $p(b \to b)$ central                                          1                     0.04                4.0
0032  fake rate   $p(e \to \gamma)$ plug                                        0.045                 0.012               27.0
0033  fake rate   $p(q \to e)$ central                                          $9.71\times10^{-5}$   $1.9\times10^{-6}$  2.0
0034  fake rate   $p(q \to e)$ plug                                             0.000876              $1.8\times10^{-5}$  2.1
0035  fake rate   $p(q \to \mu)$                                                $1.157\times10^{-5}$  $2.7\times10^{-7}$  2.3
0036  fake rate   $p(j \to b)$                                                  0.01684               0.00027             1.6
0037  fake rate   $p(q \to \tau)$ $p_T < 60$                                    0.00341               0.00012             3.5
0038  fake rate   $p(q \to \tau)$ $60 < p_T$                                    0.00038               $4\times10^{-5}$    10.5
0039  fake rate   $p(q \to \gamma)$ central                                     0.000265              $1.5\times10^{-5}$  5.7
0040  fake rate   $p(q \to \gamma)$ plug                                        0.00159               0.00013             8.2
0041  trigger     $p(e \to \mathrm{trig})$ central, $p_T > 25$                  0.976                 0.007               0.7
0042  trigger     $p(e \to \mathrm{trig})$ plug, $p_T > 25$                     0.835                 0.015               1.8
0043  trigger     $p(\mu \to \mathrm{trig})$ $|\eta| < 0.6$, $p_T > 25$         0.917                 0.007               0.8
0044  trigger     $p(\mu \to \mathrm{trig})$ $0.6 < |\eta| < 1.0$, $p_T > 25$   0.96                  0.01                1.0

Table 3.1: The 44 factors introduced in the correction model. All values are
dimensionless with the exception of code 0001 (luminosity), which has units of pb$^{-1}$. The
values and uncertainties of these correction factors are valid within the context of this
correction model.

Theoretical correction factors reflect the practical difficulty of calculating accurately
within the framework of the Standard Model. These factors take the form
of $k$-factors, so-called "knowledge factors," representing the ratio of the unavailable
all-order cross section to the calculable leading order cross section. Twenty-three
$k$-factors are used for Standard Model processes including QCD multijet production,
W+jets, Z+jets, and (di)photon+jets production.

Experimental correction factors include the integrated luminosity of the data, effi- 
ciencies associated with triggering on electrons and muons, efficiencies associated with 
the correct identification of physics objects, and fake rates associated with the mis- 
taken identification of physics objects. Obtaining an adequate description of object 
misidentification has required an understanding of the underlying physical mechanisms
by which objects are misreconstructed, as described in Appendix A.1.

In the interest of simplicity, correction factors representing $k$-factors, efficiencies,
and fake rates are generally taken to be constants, independent of kinematic quantities
such as object $p_T$, with only five exceptions. The $p_T$ dependence of three fake rates
is too large to be treated as approximately constant: the jet faking electron rate
$p(j \to e)$ in the plug region of the CDF detector; the jet faking b-tagged jet rate
$p(j \to b)$, which increases steadily with increasing $p_T$; and the jet faking tau rate
$p(j \to \tau)$, which decreases steadily with increasing $p_T$. Two other fake rates possess
geometrical features in $\eta$-$\phi$ due to the construction of the CDF detector: the jet faking
electron rate $p(j \to e)$ in the central region, because of the fiducial tower geometry of
the electromagnetic calorimeter; and the jet faking muon rate $p(j \to \mu)$, due to the
non-trivial fiducial geometry of the muon chambers. After determining appropriate
functional forms, a single overall multiplicative correction factor, determined by the
global fit, is used.

Correction factor values are obtained from a global fit to the data. The procedure
is outlined here, with further details relegated to Appendix A.3.

Events are first partitioned into final states according to the number and types
of objects present. Each final state is then subdivided into bins according to each
object's detector pseudorapidity ($\eta_{det}$) and transverse momentum ($p_T$), as described
in Appendix A.3.1.
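The first step, partitioning events into exclusive final states by object content, can
be illustrated with a short Python sketch. The label format (for example "1e 2j"), the
ordering of object types, and the function names are assumptions made for this example;
the further subdivision into $\eta_{det}$ and $p_T$ bins is not shown.

```python
from collections import Counter

# Order in which object types appear in a label (a hypothetical convention).
OBJECT_ORDER = ["e", "mu", "tau", "gamma", "j", "b", "pmiss"]


def final_state_label(object_types):
    """Build a label such as '1e 2j' from the list of object types in an event."""
    counts = Counter(object_types)
    unknown = set(counts) - set(OBJECT_ORDER)
    if unknown:
        raise ValueError(f"unknown object types: {sorted(unknown)}")
    return " ".join(f"{counts[kind]}{kind}" for kind in OBJECT_ORDER if counts[kind] > 0)


def partition(events):
    """Group events (each a list of object types) by final-state label."""
    groups = {}
    for event in events:
        groups.setdefault(final_state_label(event), []).append(event)
    return groups
```

Because the label depends only on the counts of each object type, two events with the
same content but a different object ordering fall into the same final state.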

Generated Monte Carlo events, adjusted by the correction model, provide the 
Standard Model prediction for each bin. The Standard Model prediction in each bin 
is therefore a function of the correction factor values. A figure of merit is defined 
to quantify global agreement between the data and the Standard Model prediction, 
and correction factor values are chosen to maximize this agreement, consistent with 
external experimental constraints. 

Letting $\vec{s}$ represent a vector of correction factors, for the $k$th bin

$$\chi_k^2(\vec{s}) = \frac{(\mathrm{Data}[k] - \mathrm{SM}[k])^2}{\sqrt{\mathrm{SM}[k]}^{\,2} + \delta\mathrm{SM}[k]^2}, \tag{3.1}$$

where Data[$k$] is the number of data events observed in the $k$th bin, SM[$k$] is the
number of events predicted by the Standard Model in the $k$th bin, $\delta$SM[$k$] is the
Monte Carlo statistical uncertainty on the Standard Model prediction in the $k$th
bin [*], and $\sqrt{\mathrm{SM}[k]}$ is the statistical uncertainty on the expected data in the $k$th bin.
The Standard Model prediction SM[$k$] in the $k$th bin is a function of $\vec{s}$.

Relevant information external to the Vista high-$p_T$ data sample provides additional
constraints in this global fit. The CDF luminosity counters measure the
integrated luminosity of the sample described in this chapter to be 902 pb$^{-1}$ ± 6% by
measuring the fraction of bunch crossings in which zero inelastic collisions occur [63].
The integrated luminosity of the sample measured by the luminosity counters enters
in the form of a Gaussian constraint on the luminosity correction factor. Higher
order theoretical calculations exist for some Standard Model processes, providing
constraints on corresponding $k$-factors, and some CDF experimental correction factors
are also constrained from external information. In total, 26 of the 44 correction factors
are constrained. The specific constraints employed are provided in Appendix A.3.2.
The overall function to be minimized takes the form

$$\chi^2(\vec{s}) = \left( \sum_{k \in \mathrm{bins}} \chi_k^2(\vec{s}) \right) + \chi^2_{\mathrm{constraints}}(\vec{s}), \tag{3.2}$$

where the sum in the first term is over bins in the CDF high-$p_T$ data sample with $\chi_k^2(\vec{s})$
defined in Eq. 3.1, and the second term is the contribution from explicit constraints.

[*] Given a set of Monte Carlo events with individual weights $w_j$, so that the total
Standard Model prediction from these Monte Carlo events is $\mathrm{SM} = \sum_j w_j$ events, the
"effective weight" $w_{\mathrm{eff}}$ of these events can be taken to be the weighted average of the
weights: $w_{\mathrm{eff}} = \sum_j w_j^2 / \sum_j w_j$. The "effective number of Monte Carlo events" is
$N_{\mathrm{eff}} = \mathrm{SM}/w_{\mathrm{eff}}$, and the error on the Standard Model prediction is
$\delta\mathrm{SM} = \mathrm{SM}/\sqrt{N_{\mathrm{eff}}}$.
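The figure of merit can be written out in a few lines of Python. The sketch below
follows Eq. 3.1 for the per-bin term and the effective-weight prescription for the
Monte Carlo statistical uncertainty. It assumes that every explicit constraint is a
simple Gaussian penalty, which the text states only for the luminosity; the function
names and the tuple layouts are hypothetical.

```python
import math


def mc_stat_uncertainty(weights):
    """deltaSM = SM / sqrt(N_eff), with N_eff = SM / w_eff and
    w_eff the weighted average of the Monte Carlo event weights."""
    total = sum(weights)
    if total <= 0.0:
        return 0.0
    w_eff = sum(w * w for w in weights) / total
    n_eff = total / w_eff
    return total / math.sqrt(n_eff)


def chi2_bin(data, sm, delta_sm):
    """Per-bin term of Eq. 3.1; assumes sm > 0 (sqrt(SM)^2 is simply SM)."""
    return (data - sm) ** 2 / (sm + delta_sm ** 2)


def chi2_constraints(constraints):
    """Sum of Gaussian penalties; each entry is (value, central, sigma)."""
    return sum(((value - central) / sigma) ** 2 for value, central, sigma in constraints)


def chi2_total(bins, constraints):
    """Eq. 3.2: bins is a list of (data, sm, delta_sm) tuples."""
    return sum(chi2_bin(d, sm, dsm) for d, sm, dsm in bins) + chi2_constraints(constraints)
```

Note that $\mathrm{SM}/\sqrt{N_{\mathrm{eff}}}$ reduces algebraically to $\sqrt{\sum_j w_j^2}$, so four events of
weight 0.5 give a prediction of 2 events with a Monte Carlo uncertainty of 1 event.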

Minimization of x^(^) in Eq. 3.2 as a function of the vector of correction factors 
s results in a set of correction factor values sq providing the best global agreement 
between the data and the Standard Model prediction. The best fit correction factor 
values are shown in Table 4.2, together with absolute and fractional uncertainties. 
The determined uncertainties are not used explicitly in the subsequent analysis, but 
rather provide information used implicitly to assist in appropriate adjustment to the 
correction model in light of observed discrepancies. The uncertainties are verified by 
subdividing the data into thirds, performing separate fits on each third, and noting 
that the correction factor values obtained with each subset are consistent within 
quoted uncertainties. Further details on the correlation matrix and other technical 
aspects of this global fit can be found in Appendix A.3.3.

Although the correction factors are determined from a global fit, in practice the
determination of many correction factors' values is dominated by one recognizable
subsample. The rate $p(j \to e)$ for a jet to fake an electron is determined largely by the
number of events in the ej final state, since the largest contribution to this final state
is from dijet events with one jet misreconstructed as an electron. Similarly, the rates
$p(j \to b)$ and $p(j \to \tau)$ for a jet to fake a $b$-tagged jet and tau lepton are determined
largely by the number of events in the bj and $\tau j$ final states, respectively. The
determination of the fake rate $p(j \to \gamma)$, photon efficiency $p(\gamma \to \gamma)$, and k-factors
for prompt photon production and prompt diphoton production is dominated by the
$\gamma j$, $\gamma jj$, and $\gamma\gamma$ final states. Additional knowledge incorporated in the determination
of fake rates is described in Appendix A.1.

The global fit $\chi^2$ per number of bins is 288.1 / 133 + 27.9, where the last term is the

Figure 3-1: Distribution of observed discrepancy between data and the Standard
Model prediction, measured in units of standard deviation ($\sigma$), shown as the solid
(green) histogram, before accounting for the trials factor. The left pane shows the 
distribution of discrepancies between the total number of events observed and pre- 
dicted in the 344 populated final states considered. Negative values on the horizontal 
axis correspond to a deficit of data compared to Standard Model prediction; posi- 
tive values indicate an excess of data compared to Standard Model prediction. The 
right pane shows the distribution of discrepancies between the observed and predicted 
shapes in 16,486 kinematic distributions. Distributions in which the shapes of data 
and Standard Model prediction are in relative disagreement correspond to large positive
$\sigma$. The solid (black) curves indicate expected distributions, if the data were truly
drawn from the Standard Model background. Interest is focused on the entries in the 
tails of the left distribution and the high tail of the right distribution. 



contribution to the $\chi^2$ from the imposed constraints. A $\chi^2$ per degree of freedom larger
than unity is expected, since the limited set of correction factors in this correction 
model is not expected to provide a complete description of all features of the data. 
Emphasis is placed on individual outlying discrepancies that may motivate a new 
physics claim, rather than overall goodness of fit. 

Corrections to object identification efficiencies are typically less than 10%; fake 
rates are consistent with an understanding of the underlying physical mechanisms 
responsible; k-factors range from slightly less than unity to greater than two for some
processes with multiple jets. All values obtained are physically reasonable. Further
analysis is provided in Appendix A.4.

With the details of the correction model in place, the complete Standard Model 
prediction can be obtained. For each Monte Carlo event after detector simulation,
Table 3.2: A subset of the comparison between data and Standard Model prediction,
showing the most discrepant final states and all final states populated with ten or more
data events. Final states are labeled according to the number and types of objects
present, and whether (high $\sum p_T$) or not (low $\sum p_T$) the summed scalar transverse
momentum of all objects in the events exceeds 400 GeV. Final states are ordered
according to decreasing discrepancy between the total number of events expected,
taking into account the error from Monte Carlo statistics, and the total number
observed in the data. Final states exhibiting mild discrepancies are shown together with
the significance of the discrepancy in units of standard deviations ($\sigma$) after
accounting for a trials factor corresponding to the number of final states considered. Final
states that do not exhibit even mild discrepancies are listed below the horizontal
line in inverted alphabetical order. Only Monte Carlo statistical uncertainties on the
background prediction are included.
Figure 3-2: The invariant mass of the tau lepton and two leading jets in the final state
consisting of three jets and one positively or negatively charged tau. (The Vista final 
state naming convention gives the tau lepton a positive charge.) Data are shown as 
filled circles, with the Standard Model prediction shown as the shaded histogram. 
This is the most discrepant kinematic distribution in the final state exhibiting the 
largest population discrepancy. 
the event weight is multiplied by the value of the luminosity correction factor and
the k-factor for the relevant Standard Model process. The single Monte Carlo event
can be misreconstructed in a number of ways, producing a set of Monte Carlo events 
derived from the original, with weights multiplied by the probability of each misre- 
construction. The weight of each resulting event is multiplied by the probability the 
event satisfies trigger criteria. The resulting Standard Model prediction, corrected as 
just described, is referred to as "the Standard Model prediction" throughout the rest 
of this document, with "corrected" implied in all cases. 
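
The weighting described above can be sketched in a few lines. This is only an illustration, not the Vista code: the function and argument names are hypothetical, and the Monte Carlo statistical error assumes the effective-weight prescription of the earlier footnote, with the effective weight taken as the sum of squared weights divided by the sum of weights.

```python
import math

def corrected_weight(generator_weight, lumi_factor, k_factor,
                     p_misreconstruction=1.0, p_trigger=1.0):
    """Weight of one (possibly misreconstructed) Monte Carlo event.

    All argument names are hypothetical; each factor multiplies the weight,
    following the order of corrections described in the text.
    """
    return (generator_weight * lumi_factor * k_factor
            * p_misreconstruction * p_trigger)

def prediction_and_mc_error(weights):
    """Return (SM, delta_SM) for a list of corrected event weights."""
    sm = sum(weights)
    if sm == 0.0:
        return 0.0, 0.0
    w_eff = sum(w * w for w in weights) / sm   # weighted average of the weights
    n_eff = sm / w_eff                         # effective number of MC events
    return sm, sm / math.sqrt(n_eff)

# Four events of weight 0.5 behave like four unweighted events scaled by 0.5.
sm, delta_sm = prediction_and_mc_error([0.5, 0.5, 0.5, 0.5])
```

With equal weights the effective number of events equals the actual number, so the relative error reduces to the familiar one over the square root of N.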

3.2.6 Results 

Data and Standard Model events are partitioned into exclusive final states, depending 
on the combinations of reconstructed final objects. This partitioning is orthogonal, 
with each event ending up in one and only one final state, as shown schematically 
in Fig. 3-3. Data are compared to Standard Model prediction in each final state, 
considering the total number of events observed and predicted, and the shapes of 
relevant kinematic distributions. 
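
The partitioning into exclusive final states can be illustrated with a small sketch. The event representation (a dictionary with an "objects" list) and the label format are assumptions made for this example only; the point is that final states are not prescribed in advance but are created as new object combinations appear.

```python
from collections import Counter, defaultdict

def final_state_label(objects):
    """Build a label from the multiset of reconstructed objects.

    The label format here is hypothetical (e.g. "3j tau"); it only needs
    to be identical for events with the same object content.
    """
    counts = Counter(objects)
    parts = []
    for kind in sorted(counts):
        n = counts[kind]
        parts.append(kind if n == 1 else f"{n}{kind}")
    return " ".join(parts)

def partition(events):
    """Place every event in exactly one final state, created on demand."""
    boxes = defaultdict(list)
    for event in events:
        boxes[final_state_label(event["objects"])].append(event)
    return boxes

events = [
    {"objects": ["j", "j", "j", "tau"]},
    {"objects": ["tau", "j", "j", "j"]},
    {"objects": ["e", "j"]},
]
boxes = partition(events)   # two final states: "3j tau" and "e j"
```

Because the label depends only on the object content, each event lands in one and only one box, and no event is discarded.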

In a data driven search, it is crucial to explicitly account for the trials factor, 
quantifying the number of places where we checked for an interesting signal. Purely 
statistical fluctuations at the level of three or more standard deviations are expected 
to appear, simply because a large number of regions are considered. A reasonably 
rigorous accounting of this trials factor is possible as long as the measures of interest 
and the regions to which these measures are applied are specified a priori, as is done 
here. In this analysis a discrepancy at the level of $3\sigma$ or greater after accounting for
the trials factor (typically corresponding to a discrepancy at the level of $5\sigma$ or greater
before accounting for the trials factor) is considered "significant." It is worth noting
that dedicated searches, checking only a small number of signal regions, typically do
not account for any trials factor, simply because it is very difficult to quantify the
effect of many people looking for new physics in different ways within the same
experiment. For that reason, instead of a mild $3\sigma$, a strong $5\sigma$ significance is considered
necessary to discover something new in our field. The assumption made silently is
Figure 3-3: Vista partitioning in final states. Final states can be viewed as boxes,
each containing events of one specific final configuration of objects. Final states have 
not been prescribed, but are created automatically as new types of events appear. 
In this way, every event, no matter how exotic, stays within the analysis, in the 
appropriate final state. 

that if one observes a $5\sigma$ effect in just one attempt, then if one could include somehow
the trials factor, the actual significance of the observation would turn out to be still
greater than $3\sigma$, therefore convincing. However, in cases where the "new physics" is
well-expected (like $t\bar{t}$ or dibosons, which are processes within the Standard Model)
"discovery" is claimed even with just $3\sigma$ without considering the trials factor.
Certainly, for physics beyond the Standard Model, a $3\sigma$ sans trials factor should not be
considered convincing proof of existence.

Discrepancy in the total number of events in a final state (fs) is measured by the
Poisson probability $p_{fs}$ that the number of predicted events would fluctuate up to
or above (or down to or below) the number of events observed. Since the expected
population is known with some uncertainty, its probability density function is
convoluted to obtain $p_{fs}$. To account for the trials factor due to the 344 Vista final states
examined, the quantity $p = 1 - (1 - p_{fs})^{344}$ is calculated for each final state. The
result is the probability $p$ of observing a discrepancy corresponding to a probability
less than $p_{fs}$ in the total sample studied. This probability $p$ can then be converted
into units of standard deviations by solving for $\sigma$ such that $\int_\sigma^\infty \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx = p$ ^. A
final state exhibiting a population discrepancy greater than $3\sigma$ after the trials factor
is thus accounted for is considered significant.
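
The trials-factor correction and the conversion to standard deviations can be sketched as follows. The function names are hypothetical and the input probability is an illustrative value; the conversion uses the one-sided Gaussian tail integral, which is how the defining equation above is read here.

```python
import math
from statistics import NormalDist

def after_trials(p_fs, n_final_states=344):
    """p = 1 - (1 - p_fs)^N, evaluated stably for very small p_fs."""
    return -math.expm1(n_final_states * math.log1p(-p_fs))

def to_sigma(p):
    """Solve the one-sided Gaussian tail integral from sigma to infinity = p."""
    return -NormalDist().inv_cdf(p)

p_fs = 2.8e-5                 # illustrative single-final-state probability
p = after_trials(p_fs)        # probability after the 344-final-state trials factor
sigma_before = to_sigma(p_fs)
sigma_after = to_sigma(p)
```

Using `expm1` and `log1p` avoids the loss of precision that a direct evaluation of 1 - (1 - p)^N suffers when p is many orders of magnitude below one.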

Many kinematic distributions are considered in each final state, including the
transverse momentum, pseudorapidity, detector pseudorapidity, and azimuthal
angle of all objects, masses of individual jets and $b$-jets, invariant masses of all object
combinations, transverse masses of object combinations including the missing momentum $\not{p}$,
angular separation $\Delta\phi$ and $\Delta R$ of all object pairs, and several other more specialized variables.
A Kolmogorov-Smirnov (KS) test is used to quantify the difference in shape of each
kinematic distribution between data and Standard Model prediction. As with
populations, a trials factor is assessed to account for the 16,486 distributions examined,
and the resulting probability is converted into units of standard deviations. A
distribution with KS statistic greater than 0.02 and probability corresponding to greater
than $3\sigma$ after assessing the trials factor is considered significant.
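
As an illustration of the shape comparison, the KS statistic between unweighted data and a weighted prediction can be computed as below. This is a minimal sketch under the assumption that both distributions are normalized to unit area, so that only shape (not population) enters; the function name is hypothetical and the conversion of the statistic into a probability is not shown.

```python
def ks_statistic(data, mc_values, mc_weights):
    """Largest distance between the data and weighted-prediction empirical CDFs."""
    data_sorted = sorted(data)
    mc_sorted = sorted(zip(mc_values, mc_weights))
    n_data, w_total = len(data_sorted), sum(mc_weights)
    i = j = 0
    mc_cum = 0.0
    d_max = 0.0
    # Both CDFs are step functions, so the maximum occurs at one of the points.
    for x in sorted(set(data_sorted) | set(mc_values)):
        while i < n_data and data_sorted[i] <= x:
            i += 1
        while j < len(mc_sorted) and mc_sorted[j][0] <= x:
            mc_cum += mc_sorted[j][1]
            j += 1
        d_max = max(d_max, abs(i / n_data - mc_cum / w_total))
    return d_max

same_shape = ks_statistic([1, 2, 3], [1, 2, 3], [1.0, 1.0, 1.0])
disjoint = ks_statistic([1, 2], [3, 4], [0.5, 0.5])
```

A requirement such as "KS statistic greater than 0.02" then acts as a floor on the size of the shape difference, independent of how statistically significant it is.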

Table 3.2 shows a subset of the Vista comparison of data to Standard Model
prediction. Shown are all final states containing ten or more data events, with the most
discrepant final states in population heading the list. After accounting for the trials
factor, no final state has a statistically significant ($> 3\sigma$) population discrepancy.
The most discrepant final state ($3j\,\tau^\pm$) contains 71 data events and 113.7 ± 3.6 events
expected from the Standard Model. The Poisson probability for 113.7 ± 3.6 expected
events to result in 71 or fewer events observed in this final state is $2.8 \times 10^{-5}$,
corresponding to an entry at $-4.03\sigma$ in Fig. 3-1. The probability for one or more of the 344
populated final states considered to display disagreement in population corresponding
to a probability less than $2.8 \times 10^{-5}$ is 1%. The $3j\,\tau^\pm$ population discrepancy
is thus not statistically significant. The most discrepant kinematic distribution in
this final state is the invariant mass of the tau lepton and the two highest transverse
momentum jets, shown in Fig. 3-2.
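
The probability of a deficit such as 71 events observed with 113.7 ± 3.6 expected can be estimated by averaging the Poisson probability over the uncertainty on the expectation. The sketch below assumes a Gaussian form for that uncertainty and a simple grid integration; both choices, and the function names, are assumptions for illustration rather than the procedure used in the analysis.

```python
import math

def poisson_cdf(n, mean):
    """P(X <= n) for X ~ Poisson(mean), summed in log space for stability."""
    return sum(math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))
               for k in range(n + 1))

def deficit_probability(n_obs, mean, sigma, steps=401, width=5.0):
    """Average P(X <= n_obs) over a Gaussian of the given mean and sigma."""
    num = den = 0.0
    for s in range(steps):
        mu = mean + sigma * width * (2.0 * s / (steps - 1) - 1.0)
        if mu <= 0.0:
            continue   # a negative expectation is unphysical; skip it
        g = math.exp(-0.5 * ((mu - mean) / sigma) ** 2)
        num += g * poisson_cdf(n_obs, mu)
        den += g
    return num / den

p_fixed = poisson_cdf(71, 113.7)                 # expectation known exactly
p_smeared = deficit_probability(71, 113.7, 3.6)  # expectation uncertain
```

Allowing the expectation to fluctuate makes the deficit more probable than in the fixed-expectation case, since lower values of the mean contribute disproportionately to the average.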

^Final states for which p > 0.5 after accounting for the trials factor are not even mildly interesting, 
and the corresponding $\sigma$ after accounting for the trials factor is not quoted. For the mildly interesting
final states with p < 0.5 after accounting for the trials factor, $\sigma$ is quoted as positive if the number of
observed data events exceeds the Standard Model prediction, and negative if the number of observed 
data events is less than the Standard Model prediction. 
The six final states with largest population discrepancy are $3j\,\tau$, 5j, $2j\,\tau$, $2j\,2\tau$,
bej, and the low-$p_T$ 3j final state, with bej being the only one of these six to exhibit
an excess of data. The $3j\,\tau$, $2j\,\tau$, and $2j\,2\tau$ final states appear to reflect an incomplete
understanding of the rate of jets faking taus ($p(j \to \tau)$) as a function of the number
of jets in the event, at the level of ~30% difference between the total number of
observed and predicted events in the most populated of these final states. The value
of $p(j \to \tau)$ is primarily determined by the $j\,\tau$ final state. Interestingly, although
the underlying physical mechanism for $p(j \to e)$ is very similar to that for $p(j \to \tau)$,
as discussed in Appendix A.1, a significant dependence on the presence of additional
jets is not observed for $p(j \to e)$.

The 5j discrepancy results from a tension with the e4j final state, whose dominant
contribution comes from 5j production convoluted with $p(j \to e)$. The low-$p_T$
3j discrepancy results from a tension with the e2j final state, whose dominant
contribution comes from 3j production convoluted with $p(j \to e)$. The bej final state is
predominantly 3j production convoluted with $p(j \to b)$ and $p(j \to e)$; this
discrepancy also arises from a tension with the low-$p_T$ 3j and e2j final states. The bej final
state is the Vista final state in which the largest excess of data over Standard Model
prediction is seen. The fraction of hypothetical similar CDF experiments that would
produce a Vista normalization excess as significant as the excess observed in this
final state is 8%. The 5j, bej, and low-$p_T$ 3j discrepancies correspond to a difference
of ~10% between the total number of observed and predicted events in these final
states.

Figure 3-1 summarizes in a histogram the measured discrepancies between data 
and the Standard Model prediction for CDF high-$p_T$ final state populations and
kinematic distributions. Values in this figure represent individual discrepancies, and 
do not account for the trials factor associated with examining many possibilities. 

Of the 16,486 kinematic distributions considered, 384 distributions are found to
correspond to a discrepancy greater than $3\sigma$ after accounting for the trials factor,
entering with a KS probability of roughly $5\sigma$ or greater in Fig. 3-1. Of these 384
discrepant distributions, 312 are attributed to modeling parton radiation, deriving
Figure 3-4: A shape discrepancy highlighted by Vista in the final state consisting
of exactly three reconstructed jets with $|\eta| < 2.5$ and $p_T > 17$ GeV, and with one of
the jets satisfying $|\eta| < 1$ and $p_T > 40$ GeV. This distribution illustrates the effect
underlying most of the Vista shape discrepancies. Filled circles show CDF data, with
the shaded histogram showing the prediction of Pythia. The discrepancy is clearly
statistically significant, with statistical error bars smaller than the size of the data
points. The vertical axis shows the number of events per bin, with the horizontal axis
showing the angular separation ($\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2}$) between the second and third
jets, where the jets are ordered according to decreasing transverse momentum. In the
region $\Delta R(j_2, j_3) \gtrsim 2$, populated primarily by initial state radiation, the Standard
Model prediction can to some extent be adjusted. The region $\Delta R(j_2, j_3) \lesssim 2$ is
dominated by final state radiation, the description of which is constrained by data
from LEP 1.
Figure 3-5: The jet mass distribution in the bj final state with $\sum p_T > 400$ GeV. The
3j $\Delta R(j_2, j_3)$ discrepancy illustrated in Fig. 3-4 manifests itself also by producing jets
more massive in data than predicted by Pythia's showering algorithm. The mass
of a jet is determined by treating energy deposited in each calorimeter tower as a 
massless 4-vector, summing the 4-vectors of all towers within the jet, and computing 
the mass of the resulting (massive) 4-vector. 
Figure 3-6: The distribution of $\Delta R$ between the jet and $b$-tagged jet in the final state
bej. The primary Standard Model contribution to this final state is QCD three jet
production with one jet misreconstructed as an electron. The similarity to the 3j
$\Delta R(j_2, j_3)$ discrepancy illustrated in Fig. 3-4 in the region $\Delta R(j, b) < 2$ is clear. Less
clear is the underlying explanation for the difference with respect to Fig. 3-4 in the
region $\Delta R(j, b) > 2$.
from the 3j $\Delta R(j_2, j_3)$ discrepancy shown in Fig. 3-4, with 186 of these 312 shape
discrepancies pointing out that individual jet masses are larger in data than in the
prediction, as shown in Fig. 3-5. In the literature, the same effect was observed
(but not emphasized) by both CDF [64, 65] and DØ [66] in Tevatron Run I. The 3j
$\Delta R(j_2, j_3)$ and jet mass discrepancies appear to be two different views of a single
underlying discrepancy, noting that two sufficiently nearby distinct jets correspond to a
pattern of calorimetric energy deposits similar to a single massive jet. The underlying
3j $\Delta R(j_2, j_3)$ discrepancy is manifest in many other final states. The final state bej,
arising primarily from QCD production of three jets with one misreconstructed as an
electron, shows a similar discrepancy in $\Delta R(j, b)$ in Fig. 3-6.

While these discrepancies are clearly statistically significant, basing a new physics 
claim on them would be premature. In the kinematic regime of the discrepancy, 
different algorithms to match exact leading order calculations with a parton shower 
lead to different predictions [67]. Newer predictions have not been systematically 
compared to LEP 1 data, which provide constraints on parton showering reflected 
in Pythia's tuning. Further investigation into obtaining an adequate QCD-based 
description of this discrepancy continues. 

An additional 59 discrepant distributions reflect an inadequate modeling of the 
overall transverse boost of the system. The overall transverse boost of the primary 
physics objects in the event is attributed to two sources: the intrinsic Fermi motion 
of the colliding partons within the proton, and soft or collinear radiation of the 
colliding partons as they approach collision. Together these effects are here referred 
to as "intrinsic $k_T$," representing an overall momentum kick to the hard scattering.
Further discussion appears in Appendix A.2.3.

The remaining 13 discrepant distributions are seen to be due to the coarseness of 
the Vista correction model. Most of these discrepancies, which are at the level of 
10% or less when expressed as (data - theory)/theory, arise from modeling most fake
rates as independent of transverse momentum. 

In summary, this global analysis of the bulk features of the high-$p_T$ data has not
yielded a discrepancy motivating a new physics claim. There are no statistically
significant population discrepancies in the 344 populated final states considered, and
although there are several statistically significant discrepancies among the 16,486 
kinematic distributions investigated, the nature of these discrepancies makes it diffi- 
cult to use them to support a new physics claim. 

This global analysis of course can not conclude with certainty that there is no new 
physics hiding in the CDF data. The Vista population and shape statistics may be 
insensitive to a small excess of events appearing at large $\sum p_T$ in a highly populated
final state. For such signals, different probes are required. Sleuth, and the Bump
Hunter, which was added in the second round of this analysis, serve this purpose. 

3.3 Sleuth 

Taking a broad view of proposed models that might extend the Standard Model,
something common is noted: nearly all predict an excess of events at high $p_T$, concentrated
in a particular final state. This feature is exploited by Sleuth [68]. Sleuth is quasi
model independent, where "quasi" refers to the assumption that the first sign of new
physics will appear as an excess of events in some final state at large summed scalar
transverse momentum ($\sum p_T$).

The first version of Sleuth was essentially developed by DØ in Tevatron Run
I [69, 70, 71], and subsequently improved by H1 in HERA Run I [72], with small
modifications. 

Sleuth relies on the following assumptions for new physics: 

1. The data can be categorized into exclusive final states in such a way that any 
signature of new physics is apt to appear predominantly in one of these final 
states. 

2. New physics will appear with objects at high summed transverse momentum 
($\sum p_T$) relative to Standard Model and instrumental background.

3. New physics will appear as an excess of data over Standard Model and instru- 
mental background. 
To the extent that the above are true, Sleuth would be more sensitive to a new
physics signal. 

3.3.1 Algorithm 

The Sleuth algorithm consists of three steps, following the above three assumptions.

Final states

In the first step of the algorithm, all events are placed into exclusive final states as in 
Vista, with the following modifications. 

• Jets are identified as pairs, rather than individually, to reduce the total number 
of final states and to keep signal events with one additional radiated gluon 
within the same final state. Final state names include "n jj" if n jet pairs are 
identified, with possibly one unpaired jet assumed to have originated from a 
radiated gluon. 

• The present understanding of quark flavor suggests that b quarks should be pro- 
duced in pairs. Bottom quarks are identified as pairs, rather than individually, 
to increase the robustness of identification and to reduce the total number of 
final states. Final state names include "n bb" if n $b$ pairs are identified.

• Final states related through global charge conjugation are considered to be
equivalent. Thus $e^+ e^- \gamma$ is a different final state than $e^+ e^+ \gamma$, but $e^+ e^+ \gamma$ and
$e^- e^- \gamma$ together make up a single Sleuth final state.

• Final states related through global interchange of the first and second generation
are considered to be equivalent. Thus $e^+ \not{p}\, \gamma$ and $\mu^+ \not{p}\, \gamma$ together make up a single
Sleuth final state. The decision to treat third generation objects ($b$ quarks and
$\tau$ leptons) differently from first and second generation objects reflects theoretical
prejudice that the third generation may be special, and the experimental ability
(in the case of $b$ quarks) and experimental challenge (in the case of $\tau$ leptons)
in the identification of third generation objects.
The symbol $\ell$ is used to denote electron or muon. The symbol $W$ is used in
naming final states containing one electron or muon, significant missing momentum,
and perhaps other non-leptonic objects. Thus the final states $e^+ \not{p}\, \gamma$, $e^- \not{p}\, \gamma$, $\mu^+ \not{p}\, \gamma$, and
$\mu^- \not{p}\, \gamma$ are combined into the Sleuth final state $W\gamma$. A table showing the relationship
between Vista and Sleuth final states is provided in Appendix A.5.1.

Summed Transverse Momentum Variable 

The second step of the algorithm considers a single variable in each exclusive final 
state: the summed scalar transverse momentum of all objects in the event ($\sum p_T$).
Assuming momentum conservation in the plane transverse to the axis of the colliding
beams,

$$\sum_i \vec{p}_i + \overrightarrow{\rm uncl} + \vec{\not{p}} = \vec{0}, \qquad (3.3)$$

where the sum over $i$ represents a sum over all identified objects in the event, the
$i$th object has momentum $\vec{p}_i$, $\overrightarrow{\rm uncl}$ denotes the vector sum of all momentum visible
in the detector but not clustered into an identified object, $\vec{\not{p}}$ denotes the missing
momentum, and the equation is a two-component vector equality for the components
of the momentum along the two spatial directions transverse to the axis of the colliding
beams. The Sleuth variable $\sum p_T$ is then defined by

$$\sum p_T \equiv \sum_i |\vec{p}_i| + |\overrightarrow{\rm uncl}| + |\vec{\not{p}}|, \qquad (3.4)$$

where only the momentum components transverse to the axis of the colliding beams
are considered when computing magnitudes.
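
The definition can be illustrated with a short sketch. It assumes that each object and the unclustered momentum are given as transverse (x, y) components, and that the missing momentum is obtained from transverse momentum balance as described above; the function names and the input format are hypothetical.

```python
import math

def transverse_magnitude(p):
    """Magnitude of a two-component transverse momentum (px, py)."""
    return math.hypot(p[0], p[1])

def summed_pt(objects, unclustered=(0.0, 0.0)):
    """Summed scalar transverse momentum of objects, unclustered and missing momentum."""
    # Transverse momentum balance fixes the missing momentum.
    sum_x = sum(px for px, _ in objects) + unclustered[0]
    sum_y = sum(py for _, py in objects) + unclustered[1]
    missing = (-sum_x, -sum_y)
    return (sum(transverse_magnitude(p) for p in objects)
            + transverse_magnitude(unclustered)
            + transverse_magnitude(missing))

# Two objects at right angles (30 and 40 GeV) imply 50 GeV of missing momentum.
value = summed_pt([(30.0, 0.0), (0.0, 40.0)])
```

A perfectly balanced event contributes no missing momentum, so its summed value is just the scalar sum over the objects and the unclustered momentum.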



Regions

The algorithm's third step involves searching for regions in which more events are seen
in the data than expected from Standard Model and instrumental background. This
search is performed in the variable $\sum p_T$ defined in the second step of the algorithm,
for each of the exclusive final states defined in the first step.

The steps of the search can be sketched as follows.



• In each final state, the regions considered are the one dimensional intervals in
$\sum p_T$ extending from each data point up to infinity. A region is required to
contain at least three data events, as described in Appendix A.5.

• In a particular final state, the data point with the $d$th largest value of $\sum p_T$
defines an interval in the variable $\sum p_T$ extending from this data point up to
infinity. This semi-infinite interval contains $d$ data events. The Standard Model
prediction in this interval, estimated from the Vista comparison, integrates to
$b$ predicted events. In this final state, the interest of the $d$th region is defined as
the Poisson probability p-val $= \sum_{i=d}^{\infty} \frac{b^i}{i!} e^{-b}$ that the Standard Model background
$b$ would fluctuate up to or above the observed number of data events $d$ in this
region. The most interesting region in this final state is the one with smallest
Poisson probability (p-val$_{\min}$).

• For this final state, pseudo experiments are generated, with pseudo data pulled
from the Standard Model background. For each pseudo experiment, the interest
of the most interesting region is calculated. An ensemble of pseudo experiments
determines the fraction $\mathcal{P}$ of pseudo experiments in this final state in which
the most interesting region is more interesting than the most interesting region
in this final state observed in the data. Namely, for each final state, $\mathcal{P}$ is the
fraction of pseudo-data distributions, pulled from the Standard Model
expectation, where p-val$_{\min}$ was smaller than the p-val$_{\min}$ observed in the actual data
distribution. If there is no new physics in this final state, $\mathcal{P}$ is expected to be
a random number pulled from a uniform distribution in the unit interval^5. If

^5 There is a small caveat, for final states with small expected population: We require at least
3 data in a tail. If $d < 3$, then p-val = 1 by convention, i.e. the tail is totally uninteresting
by definition. Apart from p-val = 1, the most uninteresting a tail can possibly be is to have
exactly $d = 3$ and as big a background $b$ as possible. So, the largest p-val attainable for a final
state with total background $b_{\rm tot}$, before we run into p-val = 1, is p-val$_{\max} = \sum_{i=3}^{\infty} \frac{b_{\rm tot}^i}{i!} e^{-b_{\rm tot}}$. I
will show now that $\mathcal{P}$ can not assume values between p-val$_{\max}$ and 1, therefore its distribution
is not exactly uniform, but has a gap: If the actual p-val$_{\min}$ were equal to p-val$_{\max}$, then the
fraction of pseudo-data distributions which would have p-val$_{\min}$ > p-val$_{\max}$ would be $\sum_{i=0}^{2} \frac{b_{\rm tot}^i}{i!} e^{-b_{\rm tot}}$,
because they would be given p-val$_{\min}$ = 1 by convention. The rest of the pseudo-data distributions
there is new physics in this final state, $\mathcal{P}$ is expected to be small.

• Looping over all final states, $\mathcal{P}$ is computed for each final state. The minimum
of these values is denoted $\mathcal{P}_{\min}$. Let $\mathcal{R}$ be the most interesting region in the
final state with the smallest $\mathcal{P}$.

• The interest of the most interesting region $\mathcal{R}$ in the most interesting final state
is defined as $\tilde{\mathcal{P}} = 1 - \prod_a (1 - \hat{p}_a)$, where the product is over all Sleuth final
states $a$, and $\hat{p}_a$ is the lesser of $\mathcal{P}_{\min}$ and the probability for the total number of
events predicted by the Standard Model in the final state $a$ to fluctuate up to or
above three data events. The quantity $\tilde{\mathcal{P}}$ is the fraction of hypothetical similar
CDF experiments that would produce a final state with $\mathcal{P} < \mathcal{P}_{\min}$^6. The range
of $\tilde{\mathcal{P}}$ is the unit interval. If the data are distributed according to our Standard
Model prediction, $\tilde{\mathcal{P}}$ is expected to be a random number pulled from a uniform
distribution in the unit interval, as was also demonstrated experimentally (see
Fig. 3-7). If new physics is present, $\tilde{\mathcal{P}}$ is expected to be small.
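
The steps listed above can be illustrated with a compact sketch. It is not the Sleuth implementation: the function names are hypothetical, the background is represented by weighted Monte Carlo points, and pseudo-data are drawn from those points with a Poisson-fluctuated total, which is an assumption about how "pulled from the Standard Model background" might be realized.

```python
import math
import random

def poisson_tail(d, b):
    """P(X >= d) for X ~ Poisson(b), summed upward from i = d."""
    if d <= 0:
        return 1.0
    if b <= 0.0:
        return 0.0
    term = math.exp(d * math.log(b) - b - math.lgamma(d + 1))
    total, i = 0.0, d
    while True:
        total += term
        i += 1
        term *= b / i
        if i > b and term <= 1e-16 * total:
            return min(1.0, total)

def most_interesting_region(data, bkg, weights, min_events=3):
    """Return (p-val_min, d) over the tails starting at each data point."""
    ordered = sorted(data, reverse=True)
    best = (1.0, 0)   # p-val = 1 by convention if no tail holds >= 3 events
    for d in range(min_events, len(ordered) + 1):
        threshold = ordered[d - 1]
        b = sum(w for x, w in zip(bkg, weights) if x >= threshold)
        best = min(best, (poisson_tail(d, b), d))
    return best

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for modest means."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def script_p(data, bkg, weights, n_pseudo=1000, seed=1):
    """Fraction of pseudo experiments with a more interesting region than data."""
    rng = random.Random(seed)
    observed = most_interesting_region(data, bkg, weights)[0]
    b_tot = sum(weights)
    more_interesting = 0
    for _ in range(n_pseudo):
        pseudo = rng.choices(bkg, weights=weights, k=sample_poisson(b_tot, rng))
        if most_interesting_region(pseudo, bkg, weights)[0] < observed:
            more_interesting += 1
    return more_interesting / n_pseudo

def p_tilde(p_min, b_tots):
    """Combine final states: 1 - prod_a (1 - p_hat_a), as defined in the text."""
    prod = 1.0
    for b_tot in b_tots:
        prod *= 1.0 - min(p_min, poisson_tail(3, b_tot))
    return 1.0 - prod

bkg = list(range(1, 101))          # toy background points in summed pT
weights = [0.1] * 100              # total expected background of 10 events
data = [95, 96, 97, 98, 99, 10, 20, 30]
p_val_min, d_best = most_interesting_region(data, bkg, weights)
```

In this toy example the five highest data points sit where only 0.6 background events are expected, so the tail starting at the fifth-largest point is the most interesting region.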

An alternative statistic to $\tilde{\mathcal{P}}$ was first implemented in this analysis. The new
measure of significance, $\widetilde{\text{p-val}}$, is the probability that, in a pseudo-experiment, at
least one tail, in any final state, would have a p-val smaller than the smallest
p-val found among all tails and all final states in the data. In other words, $\widetilde{\text{p-val}}$ is

would have p-val$_{\min}$ ≤ p-val$_{\max}$, therefore $\mathcal{P} \simeq 1 - \sum_{i=0}^{2} \frac{b_{\rm tot}^i}{i!} e^{-b_{\rm tot}}$ = p-val$_{\max}$. For any actual
p-val$_{\min}$ < p-val$_{\max}$, $\mathcal{P}$ will be even smaller than p-val$_{\max}$, as it will be more challenging for a
pseudo-data distribution to exceed that p-val$_{\min}$. If p-val$_{\min}$ = 1, which has probability $\sum_{i=0}^{2} \frac{b_{\rm tot}^i}{i!} e^{-b_{\rm tot}}$,
then all pseudo-data distributions would be at least as interesting, therefore $\mathcal{P} \simeq 1$. Therefore, the
distribution of $\mathcal{P}$ has a Kronecker $\delta$ term at 1, multiplied by $\sum_{i=0}^{2} \frac{b_{\rm tot}^i}{i!} e^{-b_{\rm tot}}$, and the rest is spread
at values $\mathcal{P}$ < p-val$_{\max}$. This gap in possible $\mathcal{P}$ values shrinks as $b_{\rm tot} \gg 3$, and practically vanishes
for $b_{\rm tot} > 10$.

^6 This point deserves some explanation to become more obvious. We have $N$ final states, and
we want to find the probability that one or more of them would give a $\mathcal{P}$ smaller than the observed
$\mathcal{P}_{\min}$. If the expected distribution of $\mathcal{P}$ were exactly uniform for all $N$ final states, without the gap
discussed in footnote 5, then each final state would have equal probability $\mathcal{P}_{\min}$ to give $\mathcal{P} < \mathcal{P}_{\min}$.
In that simple case, we would just need to define $\tilde{\mathcal{P}} = 1 - \prod_a (1 - \mathcal{P}_{\min}) = 1 - (1 - \mathcal{P}_{\min})^N$. However,
depending on the total background $b_{\rm tot}$, $\mathcal{P}$ is not distributed exactly uniformly for small final states,
which gives rise to two possibilities: If for a final state the gap starts at a p-val$_{\max} \ge \mathcal{P}_{\min}$, then the
probability that this final state would give $\mathcal{P} < \mathcal{P}_{\min}$ is simply $\mathcal{P}_{\min}$. If, however, $b_{\rm tot}$ is such that
p-val$_{\max} < \mathcal{P}_{\min}$, then $\mathcal{P}_{\min}$ falls in the gap, and then that final state has probability $\sum_{i=3}^{\infty} \frac{b_{\rm tot}^i}{i!} e^{-b_{\rm tot}}$
to return $\mathcal{P} \le$ p-val$_{\max} < \mathcal{P}_{\min}$, as explained in footnote 5. This complication necessitates the
introduction of $\hat{p}_a$ in $\tilde{\mathcal{P}}$ to treat appropriately the two possible cases.
[Figure panels: left, "Distribution of P"; right, "Distribution of gaussian map of P".]

Figure 3-7: Distribution of expected values of $\tilde{\mathcal{P}}$ in ~1000 pseudo-experiments, where
pseudo-data are pulled from the Standard Model $\sum p_T$ distributions. On the right is
shown the distribution of the same values of $\tilde{\mathcal{P}}$ translated into standard deviations
($\sigma$) through the transformation $\tilde{\mathcal{P}} = \int_\sigma^\infty \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx$. The expected distribution is
consistent with a uniform distribution in the interval [0, 1], represented by the black
curve.

the probability that in a pseudo-experiment some $\sum p_T$ tail would be more significant
than the globally most significant tail found in the data. The definition of $\widetilde{\text{p-val}}$ is

$$\widetilde{\text{p-val}} = 1 - \prod_a \left( 1 - \mathcal{P}_{(a,\,\text{p-val}_{\min})} \right), \qquad (3.5)$$

where $a$ denotes a final state, $\mathcal{P}_{(a,\,\text{p-val}_{\min})}$ is the probability for final state $a$ to have
(in a pseudo-experiment) a tail of p-val < p-val$_{\min}$, and p-val$_{\min}$ is the smallest
p-val found among all tails in all final states using data. Note that, unlike when
defining $\mathcal{P}$ for a final state $a$, where p-val$_{\min}$ was the smallest p-val within that final
state, this p-val$_{\min}$ going into $\mathcal{P}_{(a,\,\text{p-val}_{\min})}$ is the global smallest p-val. Therefore, for a
final state $a$, $\mathcal{P}_{(a,\,\text{p-val}_{\min})}$ is not the same as the $\mathcal{P}$ defined earlier for each final state,
because there $\mathcal{P}$ was the probability for a final state to exceed in significance its own
most interesting tail, while $\mathcal{P}_{(a,\,\text{p-val}_{\min})}$ is the probability for final state $a$ to exceed in
significance the globally most interesting tail, which may or may not be within $a$.

The qualitative difference between $\widetilde{\text{p-val}}$ and traditional $\tilde{\mathcal{P}}$ is that $\tilde{\mathcal{P}}$ focusses on
fluctuations producing a smaller $\mathcal{P}$ than the $\mathcal{P}_{\min}$ observed in the data, while $\widetilde{\text{p-val}}$
focusses on fluctuations producing a smaller p-val. The $\mathcal{P}$ of a final state depends not
only on the significance (p-val$_{\min}$) of the most interesting tail therein, but also on the
total expected population of the final state where that tail is: A $\sum p_T$ tail of
numerically identical p-val$_{\min}$, but found in a final state with larger expected background,
results in larger $\mathcal{P}$, because bigger population means more pseudo-data, hence more
tails, hence more chances to have p-val < p-val$_{\min}$. So, $\mathcal{P}$ is not a measure of
the significance of a tail per se, but rather of a whole $\sum p_T$ distribution. Whether to
use $\tilde{\mathcal{P}}$ or $\widetilde{\text{p-val}}$ is a matter of preference. $\widetilde{\text{p-val}}$ is more intuitive, because it quantifies
the significance of $\sum p_T$ tails, which are fundamentally the features Sleuth detects,
while $\tilde{\mathcal{P}}$ quantifies the significance of whole $\sum p_T$ distributions from the view-point
of their own $\sum p_T$ excesses. Since $\tilde{\mathcal{P}}$ was invented first and has been part of Sleuth
since its conception, its use was continued in this analysis.

Output 

The output of the algorithm is the most interesting region $\mathcal{R}$ observed in the final state
with the smallest $\mathcal{P}$, and a number $\tilde{\mathcal{P}}$ quantifying the interest of $\mathcal{R}$^. A reasonable
threshold for discovery is $\tilde{\mathcal{P}}$ < 0.001, which corresponds loosely to a local 5$\sigma$ effect
after the trials factor is accounted for^.
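
To give a feeling for the numbers involved, the following Python sketch converts between a number of standard deviations and a tail probability. It assumes the usual one-sided Gaussian convention; the function names are illustrative and not part of Sleuth.

```python
import math
from statistics import NormalDist

def sigma_to_prob(n_sigma):
    """One-sided Gaussian tail probability beyond n_sigma standard deviations."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

def prob_to_sigma(prob):
    """Inverse map: number of standard deviations whose upper tail equals prob."""
    return -NormalDist().inv_cdf(prob)

print(f"local 5 sigma -> p = {sigma_to_prob(5.0):.3e}")
print(f"p = 0.001     -> {prob_to_sigma(0.001):.2f} sigma")
```

Under this convention a local 5$\sigma$ tail has a probability of a few times $10^{-7}$, so the gap between that and a global threshold of 0.001 gives a rough idea of the size of the trials factor being absorbed.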

Although no integration over systematic errors is performed in computing V, 
systematic uncertainties do affect the final Sleuth result. If Sleuth highlights a 
discrepancy in a particular final state, explanations in terms of a correction to the 
background estimate are considered. This process necessarily requires physics judge- 
ment. A reasonable explanation of a Sleuth discrepancy in terms of an inadequacy 
in the modeling of the detector response or Standard Model prediction that is consis- 
tent with external information is fed back into the Vista correction model and tested 
for global consistency. In this way, plausible explanations for discrepancies observed 
by Sleuth are incorporated into the Vista correction model. This iteration continues
until either all reasonable explanations for a significant Sleuth discrepancy
are exhausted, resulting in a possible new physics claim, or no significant Sleuth
discrepancy remains.

^If Sleuth used p-val instead of $\tilde{\mathcal{P}}$, then the most interesting tail $\mathcal{R}$ would be the one with the
globally smallest p-val. That region may happen to be the same as the most interesting region
within the final state of smallest $\mathcal{P}$, but it does not have to be.

^That is empirically confirmed in sensitivity tests (Sec. 3.3.2), where it was observed that the $\tilde{\mathcal{P}}$
discovery threshold is met at approximately the same time as p-val$_{\min} = \int_{5}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx$.

3.3.2 Sensitivity 

Two important questions must be asked: 

• Will Sleuth find nothing if there is nothing to be found? 

• Will Sleuth find something if there is something to be found? 

If there is nothing to be found, Sleuth will find nothing 999 times out of 1000,
given a uniform distribution of $\tilde{\mathcal{P}}$ and a discovery threshold of $\tilde{\mathcal{P}}$ < 0.001. The uniform
distribution of $\tilde{\mathcal{P}}$ in the absence of new physics is illustrated in Fig. 3-7. Sleuth
will of course return spurious signals if provided improperly modeled backgrounds.
The algorithm directly addresses the issue of whether an observed hint is due to a 
statistical fluctuation. Sleuth itself is unable to address systematic mismeasurement 
or incorrect modeling, but is useful in bringing these to attention. 
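
The "999 times out of 1000" statement follows directly from uniformity, as the toy Python simulation below illustrates. It only assumes that the statistic is uniformly distributed on [0, 1] in the absence of new physics; it does not run Sleuth, and the seed and sample size are arbitrary choices.

```python
import random

def false_discovery_rate(n_experiments, threshold=0.001, seed=12345):
    """Fraction of background-only pseudo-experiments that cross the threshold,
    assuming the statistic is uniform on [0, 1] when there is no new physics."""
    rng = random.Random(seed)
    crossings = sum(1 for _ in range(n_experiments) if rng.random() < threshold)
    return crossings / n_experiments

rate = false_discovery_rate(1_000_000)
print(f"spurious discoveries: {rate:.5f} of pseudo-experiments")
```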

The answer to the second question depends on the degree to which the new physics 
satisfies the three assumptions on which Sleuth is based: new physics will appear 
predominantly in one final state, at high summed scalar transverse momentum, and 
as an excess of data over Standard Model prediction. 

Known Standard Model processes 

Consideration of specific Standard Model processes can provide intuition for Sleuth's 
sensitivity to new physics. This section tests Sleuth's sensitivity to the production 
of top quark pairs, W boson pairs, single top, and the Higgs boson. 

Top quark pairs. Top quark pair production results in two $b$ jets and two $W$
bosons, each of which may decay leptonically or hadronically. The $W$ branching ratios
are such that this signal predominantly populates the Sleuth final state $Wb\bar{b}jj$,
where "$W$" denotes an electron or muon and significant missing momentum. Although
Figure 3-8: (Top left) The Sleuth final state $b\bar{b}\ell^+\ell'^-\not{p}$, consisting of events with one
electron and one muon of opposite sign, missing momentum, and two or three jets,
one or two of which are $b$-tagged. Data corresponding to 927 pb$^{-1}$ are shown as filled
circles; the Standard Model prediction is shown as the shaded histogram. (Top right)
The same final state with $t\bar{t}$ subtracted from the Standard Model prediction. (Bottom
row) The Sleuth final state $Wb\bar{b}jj$, with the Standard Model $t\bar{t}$ contribution
included (lower left) and removed (lower right). Significant discrepancies far surpassing
Sleuth's discovery threshold are observed in these final states with $t\bar{t}$ removed
from the Standard Model background estimate.
Figure 3-9: Sleuth's $\tilde{\mathcal{P}}$ as a function of assumed integrated luminosity, with top
quark pair production removed from the Standard Model background estimate. The
horizontal axis shows integrated luminosity, in units of pb$^{-1}$. The vertical axis shows
Sleuth's $\tilde{\mathcal{P}}$. With Standard Model $t\bar{t}$ production omitted from the background estimate
and actual data including $t\bar{t}$ production, Sleuth's $\tilde{\mathcal{P}}$ decreases with increasing
integrated luminosity, shown as the solid (green) line, crossing at roughly 80 pb$^{-1}$ the
discovery threshold of $\tilde{\mathcal{P}}$ < 0.001, shown as the horizontal dashed (gray) line. The
shaded (yellow) band shows the range of values of $\tilde{\mathcal{P}}$ obtained in a number of trials,
with the width of the band resulting from the statistical fluctuations of individual
top quark events.
[Table 3.3 body: the Sensitivity column is a graphical box plot in the original and is not reproduced here.]

Model  Description
1      GMSB, $\Lambda$ = 82.6 GeV, $\tan\beta$ = 15, $\mu$ > 0, with one messenger of $M = 2\Lambda$.
2      [...] 250 GeV, with standard model couplings to leptons.
3      $Z' \to q\bar{q}$, $m_{Z'}$ = 700 GeV, with standard model couplings to quarks.
4      $Z' \to q\bar{q}$, $m_{Z'}$ = 1 TeV, with standard model couplings to quarks.
5      [...] with standard model couplings to $t\bar{t}$.

Table 3.3: Summary of Sleuth's sensitivity to several new physics models, expressed
in terms of the minimum production cross section needed for discovery with 927 pb$^{-1}$.
Where available, a comparison is made to the sensitivity of a dedicated search for this
model. The solid (red) box represents Sleuth's sensitivity, and the open (white) box
represents the sensitivity of the dedicated analysis. Systematic uncertainties are not
included in the sensitivity calculation. The width of each box shows typical variation
under fluctuation of data statistics. In Models 3 and 4, there is no targeted analysis
available for comparison.
Figure 3-10: (Left) The final state consisting of events with an electron and
muon of opposite sign and missing transverse momentum, in 927 pb$^{-1}$ of CDF data.
(Right) The same final state with Standard Model $WW$, $WZ$, and $ZZ$ contributions
subtracted, and with the correction factors re-fit in the absence of these contributions.
Sleuth finds the final state to contain a discrepancy surpassing the discovery
threshold of $\tilde{\mathcal{P}}$ < 0.001 with the processes $WW$, $WZ$, and $ZZ$ removed from the
Standard Model background.



the final states $\ell^+\ell^-\not{p}\,b\bar{b}$ were important in verifying the top quark pair production
hypothesis in the initial observation by CDF [73] and DØ [74] in 1995, most of the
statistical power came from the final state $Wb\bar{b}jj$. The fully hadronic decay into $b\bar{b}4j$
has only convincingly been seen after integrating substantial Run II luminosity [75].
Sleuth's first assumption that new physics will appear predominantly in one final
state is thus reasonably well satisfied. Since the top quark has a mass of 170.9 ±
1.8 GeV [76], the production of two such objects leads to a signal at large $\sum p_T$
relative to the Standard Model background of $W$ bosons produced in association
with jets, satisfying Sleuth's second and third assumptions. Sleuth is expected to
perform reasonably well on this example.

To quantitatively test Sleuth's sensitivity to top quark pair production, this
process is removed from the Standard Model prediction, and the correction factors
are re-obtained from a global fit assuming ignorance of $t\bar{t}$ production. Sleuth easily
discovers $t\bar{t}$ production in 927 pb$^{-1}$ in the final states $b\bar{b}\ell^+\ell'^-\not{p}$ and $Wb\bar{b}jj$, shown in
Fig. 3-8. Sleuth finds $\mathcal{P}_{b\bar{b}\ell^+\ell'^-\not{p}} < 1.5 \times 10^{-8}$ and $\mathcal{P}_{Wb\bar{b}jj} < 8.3 \times 10^{-7}$, far surpassing
the discovery threshold of $\tilde{\mathcal{P}}$ < 0.001.
The test is repeated as a function of assumed integrated luminosity (Fig. 3-9),
and Sleuth is found to highlight the top quark signal at an integrated luminosity
of roughly 80 ± 60 pb$^{-1}$, where the large variation arises from statistical fluctuations
in the $t\bar{t}$ signal events. Weaker constraints on the Vista correction factors at lower
integrated luminosity marginally increase the integrated luminosity required to claim
a discovery.

$W$ boson pairs. The sensitivity to Standard Model $WW$ production is tested by
removing this process from the Standard Model background prediction and allowing
the Vista correction factors to be re-fit. In 927 pb$^{-1}$ of Tevatron Run II data, Sleuth
identifies an excess in the final state consisting of an electron and muon of
opposite sign and missing momentum. This excess corresponds to $\tilde{\mathcal{P}} < 2 \times 10^{-4}$,
sufficient for the discovery of $WW$, as shown in Fig. 3-10.

Single top. Single top quarks are produced weakly, either through a $t$-channel
process like $bu \to td \to Wb$ + jet, or through an $s$-channel process, such as $u\bar{d} \to W^+ \to
t\bar{b} \to Wb\bar{b}$. Both of these final states are merged into Sleuth's $Wb\bar{b}$ final state,
satisfying Sleuth's first assumption. Single top production will appear as an excess
of events, satisfying Sleuth's third assumption. Sleuth's second assumption is
not well satisfied for this example, since single top production does not lie at large
$\sum p_T$ relative to other Standard Model processes. Sleuth is thus expected to be
outperformed by a targeted search in this example.

Higgs boson. Assuming a Standard Model Higgs boson of mass $m_h$ = 115 GeV, the
dominant observable production mechanism is $p\bar{p} \to Wh$ and $p\bar{p} \to Zh$, populating
the final states $Wb\bar{b}$, $\ell^+\ell^-b\bar{b}$, and $\not{p}\,b\bar{b}$. The signal is thus spread over three Sleuth
final states. Events in the last of these ($\not{p}\,b\bar{b}$) do not pass the Vista event selection,
which does not use $\not{p}$ as a trigger object. Sleuth's first assumption is thus poorly
satisfied for this example. The Standard Model Higgs boson signal will appear as an
excess, but as in the case of single top production it does not appear at particularly
large $\sum p_T$ relative to other Standard Model processes. Since the Standard Model
Higgs boson poorly satisfies Sleuth's first and second assumptions, a targeted search
for this specific signal is expected to outperform Sleuth.

Specific models of new physics 

To build intuition for Sleuth's sensitivity to new physics signals, several sensitivity 
tests are conducted for a variety of new physics possibilities. Some of the new physics 
models chosen have already been considered by more specialized analyses within CDF, 
making possible a comparison between Sleuth's sensitivity and the sensitivity of 
these previous analyses. 

Sleuth's sensitivity can be compared to that of a dedicated search by determining
the minimum new physics cross section $\sigma_{\min}$ required for a discovery by each. The
discovery for Sleuth occurs when $\tilde{\mathcal{P}}$ < 0.001. In most Sleuth regions satisfying the
discovery threshold of $\tilde{\mathcal{P}}$ < 0.001, the probability for the predicted number of events
to fluctuate up to or above the number of events observed corresponds to greater than
5$\sigma$. The discovery for the dedicated search occurs when the observed excess of data
corresponds to a 5$\sigma$ effect. Smaller $\sigma_{\min}$ corresponds to greater sensitivity.

The sensitivity tests are performed by first generating pseudo-data from the Standard
Model background prediction. Signal events for the new physics model are
generated, passed through the chain of CDF detector simulation and event reconstruction,
and consecutively added to the pseudo-data until Sleuth finds $\tilde{\mathcal{P}}$ < 0.001.
The number of signal events needed to trigger discovery is used to calculate $\sigma_{\min}$.

For each dedicated analysis to which Sleuth is compared, the number of Standard
Model events expected in 927 pb$^{-1}$ within the region targeted is used to calculate
the number of signal events required in that region to produce a discrepancy corresponding
to 5$\sigma$. Using the signal efficiency determined in the dedicated analysis, $\sigma_{\min}$
is calculated. The effect of systematic uncertainties is not included in Sleuth, so it
is also removed from the dedicated analyses.
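
The calculation described for the dedicated analyses can be sketched in Python as a simple counting experiment. This is a simplified illustration under stated assumptions, not the procedure of any particular analysis: the background is treated as a known Poisson mean, 5$\sigma$ is taken as a one-sided Gaussian tail, and the background and efficiency values below are hypothetical (only the 927 pb$^{-1}$ luminosity comes from the text).

```python
import math

def poisson_tail(n_obs, mean):
    """P(N >= n_obs) for a Poisson variable with the given mean (mean > 0)."""
    if n_obs <= 0:
        return 1.0
    total, k = 0.0, n_obs
    while True:
        # Evaluate each term in log space to avoid overflow in mean**k / k!.
        term = math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))
        total += term
        if k > mean and term <= 1e-18 * total:
            break
        k += 1
    return total

def signal_events_for_discovery(background, p_threshold):
    """Smallest integer signal count s such that observing ceil(background + s)
    events has a background-only tail probability at or below p_threshold."""
    s = 0
    while poisson_tail(math.ceil(background + s), background) > p_threshold:
        s += 1
    return s

def sigma_min(n_signal, efficiency, luminosity_pb):
    """Cross section (in pb) that yields n_signal selected signal events."""
    return n_signal / (efficiency * luminosity_pb)

P_5SIGMA = 0.5 * math.erfc(5.0 / math.sqrt(2.0))  # one-sided Gaussian tail

# Hypothetical background and efficiency, chosen only to exercise the functions.
background, efficiency, luminosity_pb = 10.0, 0.05, 927.0
n_sig = signal_events_for_discovery(background, P_5SIGMA)
print(n_sig, sigma_min(n_sig, efficiency, luminosity_pb))
```

The same `sigma_min` conversion applies on the Sleuth side, with the number of signal events taken from the point at which the injected signal first brings $\tilde{\mathcal{P}}$ below 0.001.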

The results of five such sensitivity tests are summarized in Table 3.3. Sleuth is
seen to perform comparably to targeted analyses on models satisfying the assumptions
on which Sleuth is based. For models in which Sleuth's simple use of $\sum p_T$ can be
Figure 3-11: Blue points: The $\mathcal{P}$ distribution observed in 927 pb$^{-1}$, with one entry
for each of the 72 Sleuth final states with at least 3 data events. There are 131 Sleuth
final states with non-zero background and fewer than 3 data events, which are assigned $\mathcal{P}$ = 1.
Black histogram: The expected $\mathcal{P}$ distribution from all 203 Sleuth final states with
non-zero background, if instead of actual data we use pseudo-data pulled from the
expected $\sum p_T$ distribution of each final state, and omit the final states that contain
fewer than 3 pseudo-data events and therefore have $\mathcal{P}$ = 1. As explained in Sec. 3.3.1, footnote 5,
the $\mathcal{P}$ of final states with expected population < 10 is not uniformly distributed.
Of the 203 final states Sleuth considers in 927 pb$^{-1}$, 150 have Standard Model
background of less than 10 events, which causes the expected $\mathcal{P}$ distribution to slightly
favor smaller values.



improved upon by optimizing for a specific feature, a targeted search may be expected
to achieve greater sensitivity. One of the important features of Sleuth is not
only that it performs reasonably well, but that it does so broadly. In Model 1, a search
for a particular model point in a gauge mediated supersymmetry breaking (GMSB)
scenario, Sleuth gains an advantage by exploiting a final state not considered in the
targeted analysis [77]. In Model 2, a search for a $Z'$ decaying to lepton pairs, the
targeted analysis [78] exploits the narrow resonance in the $e^+e^-$ invariant mass. In
Models 3 and 4, which are searches for a hadronically decaying $Z'$ of different masses,
there is no targeted analysis against which to compare. In Model 5, a search for a
$Z' \to t\bar{t}$ resonance, the signal appears at large summed scalar transverse momentum
in a particular final state, resulting in comparable sensitivity between Sleuth and
the targeted analysis [79].
Figure 3-12: The most interesting final states identified by Sleuth. The region chosen
by Sleuth, extending up to infinity, is shown by the (blue) arrow just below the
horizontal axis. Data are shown as filled circles, and the Standard Model prediction
is shown as the shaded histogram. The Sleuth final state is labeled in the upper
left corner of each panel, with $\ell$ denoting $e$ or $\mu$, and $\ell^+\ell'^+$ denoting an electron and
muon with the same electric charge. The number at upper right in each panel shows
$\mathcal{P}$, defined in Sec. 3.3.1. The inset in each panel shows an enlargement of the region
selected by Sleuth, together with the number of events (SM) predicted by the Standard
Model in this region, and the number of data events ($d$) observed in this region.
3.3.3 Results

The distribution of $\mathcal{P}$ for the final states considered by Sleuth in the data is shown
in Fig. 3-11. The concavity of this distribution reflects the degree to which the
correction model described in Sec. 3.2.5 has been tuned. A crude correction model
tends to produce a distribution that is concave upwards, as seen in this figure, while
an overly tuned correction model produces a distribution that is concave downwards,
with more final states than expected having $\mathcal{P}$ near the midpoint of the unit interval.

The most interesting final states identified by Sleuth are shown in Fig. 3-12,
together with a quantitative measure ($\mathcal{P}$) of the interest of the most interesting region
in each final state, determined as described in Sec. 3.3.1. The legends of Fig. 3-12
show the primary contributing Standard Model processes in each of these final states,
together with the fractional contribution of each. The top six final states, which
correspond to entries in the leftmost bin in Fig. 3-11, span a range of populations,
relevant physics objects, and important background contributions. This picture is
suggestive of statistical fluctuations, spread among unrelated final states.

The final state $b\bar{b}$, consisting of two or three reconstructed jets, one or two of
which are $b$-tagged, heads the list. These events enter the analysis by satisfying
the Vista offline selection requiring one or more jets or $b$-jets with $p_T$ > 200 GeV.
The definition of Sleuth's $\sum p_T$ variable is such that all events in this final state
consequently have $\sum p_T$ > 400 GeV. Sleuth chooses the region $\sum p_T$ > 469 GeV,
which includes nearly 10^ data events. The Standard Model prediction in this region 
is sensitive to the $b$-tagging efficiency $p(b \to b)$ and the fake rate $p(j \to b)$, which have
few strong constraints on their values for jets with $p_T$ > 200 GeV other than those
imposed by other Vista kinematic distributions within this and a few other related
final states. For this region Sleuth finds $\mathcal{P}$ = 0.0055, which is unfortunately not
statistically significant after accounting for the trials factor associated with looking
in many different final states, as discussed below.
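
A rough feeling for the size of this trials factor can be obtained from the Python sketch below. It is a deliberately simplified estimate, assuming that the $\mathcal{P}$ values of the final states are independent and uniformly distributed; it is not the definition of $\tilde{\mathcal{P}}$ from Sec. 3.3.1. The inputs 0.0055 and 72 (the number of final states with at least 3 data events) are quoted in the text; the function name is illustrative.

```python
def chance_of_smaller_minimum(p_observed, n_final_states):
    """Probability that at least one of n independent, uniformly distributed
    values falls below p_observed: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p_observed) ** n_final_states

# 0.0055 and 72 are quoted in the text; independence and uniformity are assumptions.
print(f"{chance_of_smaller_minimum(0.0055, 72):.2f}")
```

Even this crude estimate gives a probability of order one third, far from the discovery threshold, which is in line with the qualitative conclusion drawn from the full calculation.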

The final state $j\not{p}$, consisting of events with one reconstructed jet and significant
missing transverse momentum, is the second final state identified by Sleuth.
The primary background is due to non-collision processes, including cosmic rays and
beam halo backgrounds, whose estimation is discussed in Appendix A.2.1. Since
the hadronic energy is not required to be deposited in time with the beam crossing,
Sleuth's analysis of this final state is sensitive to particles with a lifetime between
1 ns and 1 $\mu$s that lodge temporarily in the hadronic calorimeter, complementing
Ref. [80].

The final states $\ell^+\ell'^+\not{p}jj$, $\ell^+\ell'^+\not{p}$ and $\ell^+\ell'^+$ all contain an electron ($\ell$) and muon
($\ell'$) with identical reconstructed charge (either both positive or both negative). The
final states with and without missing transverse momentum are qualitatively different
in terms of the Standard Model processes contributing to the background estimate,
with the final state $\ell^+\ell'^+$ composed mostly of dijets where one jet is misreconstructed
as an electron and a second jet is misreconstructed as a muon; $Z \to \tau^+\tau^-$, where
one tau decays to a muon and the other to a leading $\pi^0$, one of the two photons from
which converts while traveling through the silicon support structure to result in an
electron reconstructed with the same sign as the muon, as described in Appendix A.1;
and $Z(\to \mu^+\mu^-)\gamma$, in which a photon is produced, converts, and is misreconstructed as
an electron. The final states containing missing transverse momentum are dominated
by the production of $W(\to \mu\nu)$ in association with one or more jets, with one of the
jets misreconstructed as an electron. The muon is significantly more likely than the
electron to have been produced in the hard interaction, since the fake rate $p(j \to \mu)$
is roughly an order of magnitude smaller than the fake rate $p(j \to e)$, as observed in
Table 4.2. The final state $\ell^+\ell'^+\not{p}jj$, which contains two or three reconstructed jets
in addition to the electron, muon, and missing transverse momentum, also has some
contribution from $WZ$ and top quark pair production.

The final state $\tau\not{p}$ contains one reconstructed tau, significant missing transverse
momentum, and one reconstructed jet with $p_T$ > 200 GeV. This final state in principle
also contains events with one reconstructed tau, significant missing transverse
momentum, and zero reconstructed jets, but such events do not satisfy the offline
selection criteria described in Sec. 3.2.2. Roughly half of the background is non-collision,
in which two different cosmic ray muons (presumably from the same cosmic
ray shower) leave two distinct energy deposits in the CDF hadronic calorimeter, one
with $p_T$ > 200 GeV, and one with a single associated track from a $p\bar{p}$ collision occurring
during the same bunch crossing. Less than a single event is predicted from this
non-collision source (using techniques described in Appendix A.2.1) over the past five
years of Tevatron running.

In these CDF data, Sleuth finds $\tilde{\mathcal{P}}$ = 0.46. The fraction of hypothetical similar
CDF experiments (assuming a fixed Standard Model prediction, detector simulation,
and correction model) that would exhibit a final state with $\mathcal{P}$ smaller than the smallest
$\mathcal{P}$ observed in the CDF Run II data is approximately 46%. The actual value obtained
for $\tilde{\mathcal{P}}$ is not of particular interest, except to note that this value is significantly greater
than the threshold of $\tilde{\mathcal{P}}$ < 0.001 required to claim an effect of statistical significance.
Sleuth has not revealed a discrepancy of sufficient statistical significance to justify
a new physics claim.^

Systematics are incorporated into Sleuth in the form of the flexibility in the 
Vista correction model, as described previously. This flexibility is significantly more 
important in practice than the uncertainties on particular correction factor values 
obtained from the fit. The inclusion of additional systematic uncertainties would not 
qualitatively change the conclusion that Sleuth has not revealed a discrepancy of 
sufficient statistical significance to justify a new physics claim. 

Starting from the current result of Sleuth in 927 pb$^{-1}$, a projection (Fig. 3-13)
shows that, if the dataset roughly doubles and nothing changes in the Standard Model
implementation, then $\tilde{\mathcal{P}}$ will likely be smaller than the discovery threshold. This implies
that either we are on the verge of a discovery that will happen with more data, or a
doubling of data will likely enforce some more accurate modeling of Standard Model
backgrounds, which will possibly increase $\tilde{\mathcal{P}}$ away from its predicted small value. This
clue was the main motivation to repeat and improve this search with more data, as
will be described in a later chapter.

^The alternative statistic, p-val, was found to be 22%. The region with the smallest p-val is in
the final state $b\bar{b}$, which also has the smallest $\mathcal{P}$. Therefore, the most interesting region pointed to by
both statistics is the same: $\sum p_T$ > 469 GeV in $b\bar{b}$.
[Figure 3-13 plot: CDF Run II preliminary (927 pb$^{-1}$); vertical axis $\tilde{\mathcal{P}}$ on a logarithmic scale, horizontal axis $L$ (fb$^{-1}$).]

Figure 3-13: Projection of $\tilde{\mathcal{P}}$ towards lower and higher luminosities, starting from
927 pb$^{-1}$. Values were obtained by scaling down or up both data and backgrounds.
The yellow band reflects uncertainty due to randomness in which of the present data
events would have appeared in less data, or would recur in more. The Standard Model
implementation is assumed invariant in all except total populations.

3.4 Summary of first round with 1 fb$^{-1}$

In the first round of this analysis, with 927 pb$^{-1}$, a complete Standard Model background
estimate has been obtained and compared with data in 344 populated exclusive
final states and 16,486 relevant kinematic distributions. Consideration of
exclusive final state populations yields no statistically significant (> 3$\sigma$) discrepancy
after the trials factor is accounted for. Quantifying the difference in shape of kinematic
distributions using the Kolmogorov-Smirnov statistic, significant discrepancies
are observed between data and Standard Model prediction. These discrepancies are
believed to arise from mismodeling of the parton shower and intrinsic $k_T$, and represent
observables for which a QCD-based understanding is highly motivated. None of
the shape discrepancies highlighted motivates a new physics claim.

A further systematic search (Sleuth) for regions of excess on the high-$\sum p_T$ tails
of exclusive final states has been performed, representing a quasi-model-independent
search for new electroweak scale physics. A measure of interest rigorously accounting
for the trials factor associated with looking in many regions with few events is defined,
and used to quantify the most interesting region observed in the CDF Run II data.
No region of excess on the high-$\sum p_T$ tail of any of the Sleuth exclusive final states
surpasses the discovery threshold.

Although this result of course can not prove that no new physics is hiding in the 
studied data, this search is the most encompassing test of the Standard Model at the 
energy frontier. 
Chapter 4
Update with 2 fb$^{-1}$

This analysis was conducted in two rounds: first with 1 fb$^{-1}$ of data, and then with
2 fb$^{-1}$. The first round was presented in Chapter 3. This chapter summarizes the
second round.

4.1 Overview 

Four separate statistics are employed to search for evidence of new physics. These 
statistics are 

• a difference between the number of observed and predicted events in individual 
exclusive final states; 

• a difference in distribution shape between data and Standard Model prediction 
in a variety of kinematic variables; 

• an excess of data in the large $\sum p_T$ tail of exclusive final states; and

• a local excess (bump) in some invariant mass distribution, reflecting possibly a 
new resonance. 

The next sections discuss these statistics: Sec. 4.2 is about the normalization and
shape statistics, Sec. 4.3 about the $\sum p_T$ statistic, and Sec. 4.4 about the mass bump
statistic. Conclusions are provided in Sec. 4.5.
4.2 Vista

Conceptually, Vista in the second round of analysis is the same as in the first.

4.2.1 Object identification

The particle identification criteria used in this analysis are the same as in the first 
round, except for the following changes: 

• Changed previously suboptimal conversion filter to the standard one. In the
previous version, we required each lepton candidate to not have within $\Delta R$ < 0.4
another track of opposite sign. The neighbor track was counted only if it had
$p_T$ > 2 GeV. In this version, we make no transverse momentum requirement
on the candidate neighbor tracks. This change significantly reduces the rate for
jets and photons to fake electrons, since both fakings involve conversions.

• For plug electrons we now require the presence of a good quality PES cluster^,
and that the PHX track matches the electromagnetic cluster to within $\Delta R$ <
0.01. This reduces the rate of jets faking electrons in the region $|\eta_{det}|$ > 1.

• For CMUP muons, we require the CMU distance between a stub and the track
extrapolation ($\Delta X$) to be less than 7 cm, instead of 3 cm. This follows a change
in the standard muon identification criteria used by the experiment.

• For taus, the momentum is now taken from the calorimeter $E_T$ rather than
visible momentum (track momentum plus $\pi^0$'s). The minimum seed track $p_T$
requirement has been increased to 10.5 GeV, reflecting a change in online trigger
criteria. We also added a muon veto cut requiring that the
calorimeter energy over seed track $p_T$ be greater than 0.5, inconsistent with a minimum
ionizing particle.

• For plug photons, we apply the fiducial cut $|\eta_{det}|$ > 1.2.

Tables with identification criteria for all objects can be found in Appendix B.2. 
^Variables PES 5x9 U and PES 5x9 V need to be defined and less than 0.65. 
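
The change to the conversion filter described in the first item above can be illustrated with a short Python sketch. This is a simplified reading of that item, not the experiment's actual code: the `Track` class and function names are hypothetical, and only the $\Delta R$ < 0.4 cone, the opposite-sign requirement, and the old 2 GeV neighbor threshold are taken from the text.

```python
import math
from dataclasses import dataclass

@dataclass
class Track:
    eta: float     # pseudorapidity
    phi: float     # azimuthal angle in radians
    charge: int    # +1 or -1
    pt: float      # transverse momentum in GeV

def delta_r(a, b):
    """Angular separation sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi]."""
    d_phi = math.remainder(a.phi - b.phi, 2.0 * math.pi)
    return math.hypot(a.eta - b.eta, d_phi)

def passes_conversion_filter(lepton, tracks, max_dr=0.4, min_neighbor_pt=0.0):
    """Reject the lepton if an opposite-sign track lies within max_dr.

    min_neighbor_pt=2.0 mimics the first-round requirement on the neighbor
    track; the default 0.0 mimics the updated filter with no pT requirement.
    """
    for trk in tracks:
        if trk is lepton:
            continue
        if (trk.charge == -lepton.charge
                and trk.pt > min_neighbor_pt
                and delta_r(lepton, trk) < max_dr):
            return False
    return True
```

With these definitions, a soft opposite-sign track close to the lepton is ignored when `min_neighbor_pt=2.0` but vetoes the lepton under the default setting, which is the direction of the change described above.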
4.2.2 Event selection

The following criteria are used to keep events of interest. Single-object criteria accept
events containing:

• a central electron with $p_T$ > 25 GeV, or

• a plug electron with $p_T$ > 40 GeV, or

• a central muon with $p_T$ > 25 GeV, or

• a central photon with $p_T$ > 60 GeV, or

• a central or plug photon with $p_T$ > 300 GeV, or

• a central jet or $b$-tagged jet with $p_T$ > 200 GeV, or

• a central $b$-tagged jet with $p_T$ > 60 GeV (prescaled by the online jet20 trigger),
or

• a central jet or $b$-tagged jet with $p_T$ > 40 GeV (prescaled by 10 in addition to
the online jet20 trigger prescale).

Di-object criteria keep events containing:

• one electron plus one electron or photon with $|\eta|$ < 2.5 and $p_T$ > 25 GeV, or

• a central or plug electron with $p_T$ > 40 GeV and a central tau with $p_T$ > 17 GeV,
or

• a central muon with $p_T$ > 17 GeV and a central or plug photon with $p_T$ >
25 GeV, or

• a central muon with $p_T$ > 25 GeV and a central $b$-tagged jet with $p_T$ > 17 GeV,
or

• two taus with $|\eta|$ < 1.0 and $p_T$ > 25 GeV, or

• a central or plug photon with $p_T$ > 40 GeV and a central tau with $p_T$ > 40 GeV,
or

• one central photon with $p_T$ > 25 GeV and one other central or plug photon
with $p_T$ > 25 GeV, or

• a central photon with $p_T$ > 40 GeV and a central $b$-tagged jet with $p_T$ > 25 GeV,
or

• a central jet or $b$-tagged jet with $p_T$ > 40 GeV and a central tau with $p_T$ >
17 GeV (prescaled by the online jet20 trigger), or

• a central jet with $p_T$ > 60 GeV and a central $b$-tagged jet with $p_T$ > 25 GeV
(prescaled by the online jet20 trigger), or

• two central muons with $p_T$ > 17 GeV, or

• one central electron and one central muon with $p_T$ > 17 GeV, or

• one central electron with $p_T$ > 20 GeV and one central tau with $p_T$ > 17 GeV,
or

• one plug electron with $p_T$ > 25 GeV and one central muon with $p_T$ > 17 GeV,
or

• one central muon with $p_T$ > 20 GeV and one central tau with $p_T$ > 17 GeV.

Tri-object criteria keep events containing:

• a central or plug photon with $p_T$ > 40 GeV and two central taus with $p_T$ >
17 GeV, or

• a central or plug photon with $p_T$ > 40 GeV and two central $b$-tagged jets with
$p_T$ > 25 GeV, or

• a central or plug photon with $p_T$ > 40 GeV, a central tau with $p_T$ > 25 GeV,
and a central $b$-tagged jet with $p_T$ > 25 GeV, or

• one $b$-tagged jet with $p_T$ > 90 GeV and two more $b$-tagged jets with $p_T$ >
60 GeV, or
• one central muon with $p_T$ > 17 GeV and two other central or plug muons with
$p_T$ > 17 GeV.

Additional special criteria accept events containing: 

• one central or plug electron with > 40 GeV, missing transverse momentum 
greater than 17 GeV, and two or more jets or central 6-tagged jets with pj- > 
17 GeV, or 

• one central muon with pt > 25 GeV, missing transverse momentum greater 
than 17 GeV, and two or more jets or central 6-tagged jets with pt > 17 GeV. 

The above criteria are set by the requirements that the corresponding Standard
Model prediction can be generated with enough Monte Carlo events to have weights
< 1, and that trigger efficiencies can be treated as roughly independent of object
pT, while keeping as many potentially interesting events as possible.

Explicit online trigger paths are no longer required. CDF-specific details are
provided in Sec. B.1.
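
As an illustration of how such offline criteria act on reconstructed objects, the
following is a minimal sketch of three of the di-object criteria and one of the
tri-object criteria listed above. The class `PhysicsObject`, the object labels and
the function names are hypothetical and are not taken from the analysis code; only
the thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicsObject:
    """A reconstructed object (hypothetical representation, for illustration)."""
    kind: str             # assumed labels: "e", "mu", "tau", "ph", "j", "b"
    pt: float             # transverse momentum in GeV
    central: bool = True  # True for central, False for plug


def n_objects(event, kind, min_pt, allow_plug=False):
    """Count objects of one kind above a pT threshold."""
    return sum(
        1
        for o in event
        if o.kind == kind and o.pt > min_pt and (o.central or allow_plug)
    )


def passes_dilepton(event):
    """Di-object criteria from the list: mu-mu, e-mu and mu-tau (all central)."""
    two_muons = n_objects(event, "mu", 17.0) >= 2
    e_mu = n_objects(event, "e", 17.0) >= 1 and n_objects(event, "mu", 17.0) >= 1
    mu_tau = n_objects(event, "mu", 20.0) >= 1 and n_objects(event, "tau", 17.0) >= 1
    return two_muons or e_mu or mu_tau


def passes_photon_two_b(event):
    """Tri-object criterion: photon (central or plug) plus two central b-jets."""
    photon = n_objects(event, "ph", 40.0, allow_plug=True) >= 1
    b_jets = n_objects(event, "b", 25.0) >= 2
    return photon and b_jets
```

An event would be kept if any one of the criteria returns true; the remaining
criteria in the list follow the same pattern with their own thresholds.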

4.2.3 Event generation 

Changes made to our Monte Carlo event generation since the first round of
analysis are summarized here.

• A number of electroweak samples changed to use the newest (Gen6) cdfSim
version. They include (the Stntuple sample names are given in parentheses):
Pythia W → eν (we0sfe, we0sge, we0she), Pythia W → μν (we0s8m, we0s9m),
Pythia W → τν (we0s9t, we0sat), Pythia Z → ee (ze1s6d, ze1sad, ze0scd,
ze0sdd, ze0sed, ze0see), Pythia Z → μμ (ze1s9m, ze0sbm, ze0scm, ze0sdm,
ze0sem), Pythia Z → ττ (ze0s8t, ze0sat), Pythia WW (we0sbd, we0sgd), Pythia
WZ (we0scd), Baur W(→ eν) + γ (re0s28, re0s48), Baur W(→ μν) + γ (re0s29,
re0s49), Baur W(→ τν) + γ (re0s1a, re0s4a).

• A low mass Drell-Yan sample was added with M_Z going down to 10 GeV (zx0sde,
zx0sdm).

• We switched from using the Mrenna matched W+jets sample to the standard
Top Group Alpgen W+jets samples: W(→ eν)+jets (ptopw0, ptopw1, ptop2w,
ptop3w, ptop4w), W(→ μν)+jets (ptopw5, ptopw6, ptop7w, ptop8w, ptop9w),
W(→ τν)+jets (utopw0, utopw1, utop2w, utop3w, utop4w).

• We switched from using MadEvent W+bb to the standard Top Group W+bb
sample: W(→ eν)+bb+jets (btop0w, btop1w, btop2w), W(→ μν)+bb+jets
(btop5w, btop6w, btop7w).

Table 4.1 summarizes the contributions from each Monte Carlo sample.

Specific modifications to the correction model implemented since the first round
are described here.

• The integrated luminosity of the data sample considered has increased from 927
to 1990 pb^-1. The integrated luminosity correction factor has been adjusted
accordingly.

• Events from more recent data have been included in the high-pT jet and photon
non-collision backgrounds. For events with ΣpT > 400 GeV, at least two
jets of pT > 10 GeV and no objects of other kinds, we require the pT of the
jet with the second largest pT to be greater than 75 GeV. This cut cleans
multijet samples of events in which the second jet comes from the underlying
event but the first jet is due to a cosmic ray. Such events are not modeled well
by our cosmic background, which comprises events required to have fewer than
three tracks; this requirement reduces the fraction of such cosmic + jet(s)
events relative to the data sample, where more than three tracks are required.
As a result of these changes, the cosmic_ph and cosmic_j correction factors have
been readjusted.

• It was recognized that in the previous version of the analysis we had been using a
suboptimal filter for conversion electrons. This filter has been updated and now
yields a substantially reduced rate for jets faking electrons via fragmentation to
a leading π0.
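
The second-jet requirement in the list above can be sketched as a small predicate.
This is only an illustration: the function name is hypothetical, and when no ΣpT is
supplied the sketch assumes it can be approximated by the scalar sum of the jet pT
values, which is not necessarily how ΣpT is defined in the analysis.

```python
def passes_cosmic_cleanup(jet_pts, n_other_objects=0, sum_pt=None):
    """Return False for multijet-like events that look like cosmic + soft jet.

    jet_pts: transverse momenta of the jets in GeV.
    n_other_objects: number of reconstructed objects that are not jets.
    sum_pt: summed scalar pT in GeV; if None, approximated by the jet sum
            (an assumption made for this sketch).
    """
    jets = sorted((pt for pt in jet_pts if pt > 10.0), reverse=True)
    if sum_pt is None:
        sum_pt = sum(jets)
    cut_applies = sum_pt > 400.0 and len(jets) >= 2 and n_other_objects == 0
    if not cut_applies:
        return True          # the requirement only targets this event class
    return jets[1] > 75.0    # pT of the jet with the second largest pT
```

An event with one very hard jet and one soft jet is rejected, while a balanced
dijet event of the same ΣpT is kept.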
Table 4.1: The number of events contributing from each Standard Model process,
ordered according to decreasing effective weight of individual Monte Carlo events.
The data set names are shown in the leftmost column, with the corresponding process
shown in the second column. The typical weight of individual events from each process
is shown in the third column, and the "effective" number of events from each process
contributing to the background estimate is shown in the fourth column. The weight
from each process is totaled in the rightmost column, and the total weight is provided
at bottom. The total weight is equal to the roughly four million events included in
this analysis.

Code  Category    Explanation                     Value        Error       Error(%)
0001  luminosity  CDF integrated luminosity                    50          2.6
      k-factor                                                             0.7
0029  misId       p(μ→μ) CMUP+CMX                 0.888        0.007       0.8
0030  misId       p(γ→γ) central                  0.949        0.018       1.9
0031  misId       p(γ→γ) plug                     0.859        0.016       1.9
0032  misId       p(b→b) central                  0.978        0.021       2.1
0033  misId       p(γ→e) plug                     0.06         0.003       5.0
0034  misId       p(q→e) central                  7.09x10^-5   1.9x10^-6   2.7
0035  misId       p(q→e) plug                     0.000766     1.2x10^-5   1.6
0036  misId       p(q→μ)                          1.14x10^-5   6x10^-7     5.2
0037  misId       p(b→μ)                          3.3x10^-5    1.1x10^-5   33.0
0038  misId       p(j→b) 25<pT                    0.0183       0.0002      1.1
0039  misId       p(q→τ)                          0.0052       0.0001      1.9
0040  misId       p(q→γ) central                  0.000266     1.4x10^-5   5.3
0041  misId       p(q→γ) plug                     0.00048      6x10^-5     12.6
0042  trigger     p(e→trig) plug, pT>25           0.86         0.007       0.8
0043  trigger     p(μ→trig) CMUP+CMX, pT>25       0.916        0.004       0.4


Table 4.2: The correction factors of the Vista correction model. The best fit values
(Value) are given in the 4th column. Correction factor errors (Error) resulting from
the fit are shown in the 5th column. The fractional error (Error(%)) is listed in the
6th column. All values are dimensionless except for the first one, which represents
integrated luminosity and has units of pb^-1. These values and uncertainties are valid
within the context of this correction model.

• In order to accommodate the ditau trigger, which in recent data requires a seed
track with pT > 10 GeV, and recognizing our concentration on the identification
of single-prong taus, the track requirement for taus has been increased to
10.5 GeV. The fake rate p(j → τ) and its dependence on pT have been adjusted
accordingly.

• In order to address questions regarding the fake rate p(j → τ) and its consistent
simultaneous application to many final states, the measurement of tau pT is now
based on the energy deposited in the calorimeter.

• In order to address questions regarding the fake rate p(j → τ) and its consistent
simultaneous application to multijet final states with large and small ΣpT, a
monotonically decreasing dependence of the fake rate p(j → τ) on the generated
summed scalar transverse momentum has been imposed.

• In the implementation of the fake rates p(j → e), p(j → μ), and p(j → τ), jets
from a parent u or anti-d quark now only fake positively charged μ and τ leptons
(and positrons rather than electrons at a ratio of 2:1), and jets from a parent
anti-u or d quark now only fake negatively charged μ and τ leptons (and electrons
rather than positrons at a ratio of 2:1).

• The ditau trigger, which turned on roughly 300 pb^-1 into Run II, has now been
live for a greater fraction of the total integrated dataset. The effective ditau
trigger efficiency has been adjusted accordingly.

• A fake rate p(b → μ) has been introduced.

• The pT dependence of the fake rates p(j → b) and p(j → τ) has been adjusted.

• The η_det and φ dependence of the fake rates p(j → e) and p(j → γ) has been
adjusted to take into account more geometric features of the detector, including
the calorimeter cracks at η_det of 0 and 1.1.

• The efficiency for reconstructing a jet as a non-b-tagged jet has been reduced
from 1 to 1 - p(j → b).

[Figure: two panes, each labeled "CDF Run II Preliminary (2 fb^-1)". Left pane
entries: 399. Right pane entries: 19650.]

Figure 4-1: Distribution of discrepancy (before accounting for the trials factor)
between data and Standard Model prediction, measured in units of standard deviation
(σ). The left pane shows the distribution of discrepancies between the total number
of events observed and predicted in the final states considered. Final states with
data excess populate the right tail, while those with data deficit populate the left
tail. The right pane shows the distribution of discrepancies between the observed and
predicted shapes of roughly 2 x 10^4 kinematic distributions. Distributions in
agreement correspond to small or negative σ, and distributions in disagreement
correspond to large positive σ. Interest is in the entries in both tails of the
distribution on the left, and in the right tail of the distribution on the right.

• Separate k-factors have been introduced for heavy flavor multijet production for
the high-pT sample. Specifically, a new k-factor has been introduced for events
with at least one heavy flavor jet and three jets in total, with pT > 150 GeV.
Another k-factor has been introduced for events with at least one heavy flavor
jet and four or more jets in total, with pT > 150 GeV. They are listed in
Table 4.2 as 1b2j and 1b3j.
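
The charge assignment for fake leptons described in the list above can be
illustrated with a short sketch. The parent labels ("u", "dbar", "ubar", "d"), the
function name and the treatment of other parents (a 50/50 charge choice) are
assumptions made here for illustration only; the text specifies only the behavior
for u, d and their antiquarks.

```python
import random

POSITIVE_PARENTS = {"u", "dbar"}   # positively charged parent quarks
NEGATIVE_PARENTS = {"ubar", "d"}   # negatively charged parent quarks


def fake_lepton_charge(parent, lepton, rng=random):
    """Charge (+1 or -1) assigned to a lepton faked by a jet from `parent`.

    lepton is "e", "mu" or "tau". Fake muons and taus always take the sign of
    the parent quark; fake electrons take it with probability 2/3 (ratio 2:1).
    """
    if lepton not in ("e", "mu", "tau"):
        raise ValueError(f"unknown lepton type: {lepton!r}")
    if parent in POSITIVE_PARENTS:
        favored = +1
    elif parent in NEGATIVE_PARENTS:
        favored = -1
    else:
        # Assumption of this sketch: no charge preference for other parents.
        return rng.choice((+1, -1))
    if lepton == "e":
        return favored if rng.random() < 2.0 / 3.0 else -favored
    return favored
```

Passing a seeded `random.Random` instance as `rng` makes the assignment
reproducible, which is convenient when comparing same-sign and opposite-sign
final state predictions.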



4.2.4 Results 

The global fit χ², described in Sec. 3.2.5, was 784.43 in the second round, from 335
bins, plus 28.4 from external constraints. This is clearly a very large χ², even more
unlikely than it was in the first round of the analysis, indicating that deviations
from the fit are not statistical, but are due to systematic imperfections in our
Standard Model implementation. Higher statistics exacerbate systematic imperfections.
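
To give a rough sense of how unlikely such a χ² is, the sketch below computes the
χ² per bin and an approximate Gaussian-equivalent significance using the
Wilson-Hilferty approximation. It assumes, for simplicity, that the number of
degrees of freedom equals the number of bins (335), ignoring the fitted correction
factors and the external constraint term; the function name is hypothetical.

```python
import math


def chi2_to_sigma(chi2, ndf):
    """Approximate z-score of a chi-square value (Wilson-Hilferty approximation).

    (chi2/ndf)**(1/3) is approximately Gaussian with mean 1 - 2/(9 ndf)
    and variance 2/(9 ndf).
    """
    mean = 1.0 - 2.0 / (9.0 * ndf)
    width = math.sqrt(2.0 / (9.0 * ndf))
    return ((chi2 / ndf) ** (1.0 / 3.0) - mean) / width


chi2_per_bin = 784.43 / 335            # a little above 2 per bin
z = chi2_to_sigma(784.43, 335)         # well beyond 10 standard deviations
```

Under these assumptions the fit quality is far outside the range expected from
statistical fluctuations alone, consistent with the interpretation given above.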

Table 4.3 shows the comparison of CDF Run II 2 fb^-1 data to Standard Model
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Final State Data Background a 



Final State 



M±T± 

b2j^ high-SpT^ 
Sjr-'- low-Sp2^ 

2jT±TT 



j2r=t 
.±2j 

2bj low-SpT^ 
jr"'" low-Sp2^ 
2b2j low-EpT 



2b7 

8j 

7j 

6j 

5j 

4j low-SpT^ 
4j27 

4jT± high-Epr 

4jT^ low-SpT^ 

4j;) high-Epj, 

4j7r± 

4j7J* 

4j7 

4jM*J* 
4jM*f ^ 
4jM± 

3j high-EpT 
3j low-Ep7^ 
3j27 

3jr* high-Epj, 
3j;i high-EpT 

3j7r='= 

3j7J* 

3j7 



3jp 7 

3jM*M^ 

3jM± 

3b2j 

3bj 

3b 

2t± 

27ji 

27 

2j high-Epj, 
2j low-SpT^ 
2j2T± 
2j27?4 
2j27 

2jT± high-Epr 



Data 


Background 


cr 




690 


817.7 


± 9.2 


-2 


,7 


1371 


1217.6 


± 13.3 


+ 2. 


2 


63 


35.2 


± 2.8 


+ 1. 


7 


255 


327.2 


± 8.9 


-1 


,7 


574 


670.3 


± 8.6 


-1 


.5 


148 


199.8 


± 5.2 


-1 


,4 


36 


17.2 


± 1.7 


+ 1.4 


33 


62.1 


± 4.3 


-1 


,3 


741710 


764832 


± 6447.2 


-1 


,3 


105 


150.8 


± 6.3 


-1 


,2 


256946 


249148 


± 2201.5 


+ 1. 


2 


279 


352.5 


± 11.9 


-1 


,1 


1385 


1525.8 


± 15 


-1 


1 


108 


153.5 


± 6.8 


-1 




528 


613.5 


± 8.7 


-0.9 


523 


611 


± 12.1 


-0, 


,8 


108 


70.5 


± 7.9 


+0.1 


14 


13.1 


± 4.4 







103 


97.8 


± 12.2 







653 


659.7 


± 37.3 







3157 


3178.7 


± 67.1 







88546 


89096.6 


± 935.2 







14872 


14809.6 


± 186.3 







46 


46.4 


± 3.9 







29 


26.6 


± 1.7 







43 


63.1 


± 3.3 







1064 


1012 


± 62.9 







19 


10.8 


± 2 







62 


104.2 


± 22.4 







7962 


8271.2 


± 245.1 







574 


590.5 


± 13.6 







38 


48.4 


± 6.2 







1363 


1350.1 


± 37.7 







159926 


159143 


± 1061.9 







62681 


64213.1 


± 496 







151 


177.5 


± 7.1 







68 


76.9 


± 3 







1706 


1899.4 


± 77.6 







42 


36.2 


± 5.7 







39 


37.8 


± 3.6 







204 


249.8 


± 24.4 







24639 


24899.4 


± 372.4 







2884 


2971.5 


± 52.1 







10 


3.6 


± 1.9 







15 


7.9 


± 2.9 







175 


177.8 


± 16.2 







5032 


4989.5 


± 108.9 







23 


28.9 


± 4.7 







82 


82.6 


± 5.7 







67 


85.6 


± 7.7 







498 


512.7 


± 14.2 







128 


107.2 


± 6.9 







5548 


5562.8 


± 40.5 







190773 


190842 


± 781.2 







165984 


162530 


± 1581 







22 


40.6 


± 3.2 







11 


8 


± 2.4 







580 


581 


± 13.7 







96 


114.6 


± 3.3 








Final State 


Data 


Background 




jn±pTj( 

j^^^[^7 


32 


32.2 


± 


10.9 





2jji high-SpT 


87 


80 


.9 


± 


6.8 





14 


11.5 


± 


2.6 





2j^ low-EpT^ 


114 


79 


.5 


± 


100.8 





4852 


4271.2 


± 


185.4 





2j)*T± 


18 


13 


.2 


± 


2.2 





e 4j/5 


77689 


7fiQR7 5 




930.2 





2j7r± 
2j77i 


142 
908 


144 
980 


.6 
.3 


± 
± 


5.7 
63.7 






903 


830.6 




13.2 





2j7 


71364 


73021 


,4 




595.9 





e 


25 


29.2 




3.6 





+ T 


16 


19 


.3 




2.2 







15750 


16740.4 




390.5 





2jfi*ji 


17927 


18340, 


,6 


± 


201.9 







15 


21.1 




2.2 





2jf.±7?i 


31 


27 


.7 


± 


7.7 





e±3j;( 


4054 


4U I 1 .Z 




63.6 





2j/^^7 


57 


58, 


.2 


± 


13 





e=t3j7 


108 


79.3 


± 


5 





2in^ ^ »5 


11 


7 


.8 


± 


2.7 





e±3j 


60725 


60409.3 


± 


723.3 







956 


924 


.9 


± 


61.2 





e±27 


41 


34.2 


± 


2.6 





2iu± 


22461 


23111 


,4 


± 


366.6 





e±2jr± 


37 


47.2 


± 


2.2 





2e±i 


14 


13, 


.8 


± 


2.3 





e±2jTT 


109 


95.9 


± 


6.8 







20 


17 


.5 


± 


1.7 


Q 


e±2j;i 


25725 


25403.1 


± 


209.4 





2e± 


32 


49, 


.2 


± 


3.4 





e±2j7?( 


30 


31.8 


± 


4.8 





2b high-EpT^ 


666 


689 


± 


9.4 





e±2j7 


398 


342.8 


± 


15.7 





2b low-Sp7^ 


323 


313, 


.2 


± 


10.3 





e±2jMT?* 


22 


14.8 


± 


1.9 





2b3j low-SpT^ 


53 


57 


.4 


± 


6.5 





e±2jMT 


23 


15.8 


± 


2 





2b2j high-EpT 


718 


803 


.3 


± 


12.7 





e±r± 


437 


387 


± 


5.3 





2b2jj( high-Ep2' 


15 


21 


.8 


± 


2.8 





e±TT 


1333 


1266 


± 


12.3 





2b2j7 


32 


39 


.7 


± 


6.2 







109 


106.1 


± 


2.7 





2b2jA.±;i 

-1- 


14 


17 


.3 




1.9 







960826 


956579 


± 


3077.7 




22 


21 


.8 




2 





77^ 


497 


496.8 


± 


10.3 







11 


14 


.4 




2. 1 





6 7 


3578 


3589.9 


± 


24.1 





2bj high-SpT-- 
2bj^ higli-SpT^ 


891 
25 


967 
31 


.1 
.3 




13 2 
3.1 






31 


29.9 


± 


1.6 





2bj7 


71 


54 


.5 


± 


7.1 





109 


99.4 


± 


2.4 





2bi/7 ^ »4 


12 


10 


.7 




1.9 





e±M± 


45 


28.5 


± 


1.8 









27 


.3 




2.2 


^ 


e±nT 
e^j27 


350 


313 


± 


5.4 







72 


66 


.5 




2.9 





13 


16.1 


± 


3.9 





2he p 


22 


19 


.1 




2.2 





386 


418 


± 


18.9 





2be jp 


19 


19 


.4 




2.2 





e±jr± 


160 


162.8 




3.5 





2be=^j 


63 


63 




3.4 







48 


44.6 




3.3 





2be=^ 


96 


92, 


.1 


i 


4.1 





^± i^.^± 


11 


8.3 




1.5 





-j- m 


856 


872.5 


± 


19 





e J7P 


121431 


121023 


± 


747.6 







3793 


3770.7 




127.3 





159 


192.6 




10.9 





fl T~ 


381 


440.9 




7.3 





s J7 


1389 


1368.9 




38.9 





±-1 =F 


60 


75, 


.7 


i 


3.4 







42 


33 




2.9 







15 


12 




2 





± ■ ±^ 


16 


9.2 




1.9 







734290 


734296 


± 


4897.6 


1 


± ■ T 


62 


63.8 




3.2 







475 


469.8 


± 


12.5 





+ . + 


13 


8.2 


± 


2 







169 


198 


.5 


± 


8.2 







148 


159.1 


± 


7 





M*M^7 


83 


60 


± 


3.1 





e±eT3j 


717 


743.6 


± 


24.4 





M±pT 


25283 


25178.5 


± 


86.5 





e±eT2jji 


32 


41.4 


± 


5.6 





j27ji 


36 


30.4 


± 


4.2 





e*eT2j7 


10 


11.4 


± 


2.9 





j27 


1822 


1813.2 


± 


27.4 





e±eT2j 


3638 


3566.8 


± 


72 





jr* high-Epr 


52 


56.2 


± 


2.5 







18 


16.1 


± 


1.7 







203 


252.2 


± 


8.7 







822 


831.8 


± 


13.6 





]i> high-Epr 


4432 


4431 


.7 


± 


45.2 





7 


191 


221.9 


± 


5.1 





j7T* 


526 


476 


± 


9.3 







155 


170.8 


± 


12.4 







1882 


1791.9 


± 


72.3 





e*eTj7 


48 


45 


± 


3.9 





'm 


103319 


102124 


± 


570.6 















17903 


18258.2 


± 


204.4 







71 


98 


± 


3.9 







98901 


99086.9 


± 


147.8 







15 


12 


± 


2 





b6j 


51 


42.3 


± 


3.8 







26 


30.8 


± 


2.6 





b5j 


237 


192.5 


± 


7.1 





109081 


108323 


± 


707.7 





b4j high-Epr 


26 


23.4 


± 


2.6 







171 


171 


.1 


± 


31 





b4j low-EpT^ 


836 


821.7 


± 


15.9 





jAi*7 


152 


190 


± 


39.3 





b3j high-EpT^ 


12081 


12071 


± 


84.1 



















b3j low-SpT^ 


2974 


2873 


± 


31 






Table 4.3: A subset of the comparison between Tevatron Run II data and Standard 
Model prediction. 
prediction. All events have been partitioned into exclusive final states. The number
of events observed is compared to the number expected from the Standard Model,
taking into account the uncertainty due to finite Monte Carlo statistics, and the
trials factor due to examining 399 final states. The final states are ordered by
decreasing discrepancy.
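
The effect of a trials factor can be sketched as follows. This is an illustration
under stated assumptions, not the analysis code: it assumes a one-sided Gaussian
tail probability per final state and independent final states, so that the
probability of at least one fluctuation of that size among N states is
1 - (1 - p)^N. The function name is hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def sigma_after_trials(sigma, n_trials):
    """Convert a discrepancy in sigma to its value after a trials factor.

    The sign of the input (excess or deficit) is preserved. Discrepancies whose
    post-trials probability is 0.5 or larger are reported as 0.
    """
    tail = _STD_NORMAL.cdf(-abs(sigma))                 # p-value before trials
    # 1 - (1 - tail)**n_trials, written to stay accurate for tiny tail values
    p_any = -math.expm1(n_trials * math.log1p(-tail))
    if p_any >= 0.5:
        return 0.0
    if p_any <= 0.0:
        return float(sigma)                             # underflow: unchanged
    return math.copysign(-_STD_NORMAL.inv_cdf(p_any), sigma)
```

With 399 final states, a discrepancy has to be roughly 4.5σ before the trials
factor in order to remain near 3σ afterwards, which is why most final states show
no significant discrepancy once the trials factor is applied.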

No final state is found to have a population discrepancy that is considered
significant after accounting for the trials factor. The largest population
discrepancy is a 2.7σ deficit (including trials factor) observed in the final state
be± pmiss. Fig. 4-1 summarizes in a histogram the distribution of discrepancies
observed in final state populations. Qualitatively, shape discrepancies give us the
same information we had in the first round of the analysis.

Discrepant distributions are flagged using the Kolmogorov-Smirnov (KS) statistic.*
Fig. 4-1 shows a histogram of the disagreement seen in all kinematic distributions.
19,650 distributions are considered in 2 fb^-1, and 559 are found to have a
significant disagreement. However, as in the first round with 1.0 fb^-1, no indication
of new physics is found amongst these discrepant distributions; all are attributed to
the "3-jet effect", difficulties with intrinsic kT, or residual coarseness of the
correction model.

Evolution of the Vista Global Comparison since 1 fb^-1

Table 4.4 displays the Vista final states which newly appeared in the present analysis.
A large number involve b-jets; this is a result of changes in our offline event
selection criteria, which now accept more events containing b-tagged jets (previously
events with a leading b-jet with pT < 200 GeV were prescaled offline by a factor of 10;
we also introduced a new tri-b offline selection).

*The KS statistic is defined in terms of the cumulative distributions of two
populations. Given a particular distribution, such as the invariant mass mass(j1, j2)
of the two jets in the 1e+2j1pmiss final state, the Standard Model prediction and the
data are both normalized to unit integral, and the cumulative distributions are drawn.
The maximal separation of the two cumulative distributions is the KS statistic, a
number between 0 and 1. This statistic can be translated into a probability for the
data to have been pulled from the Standard Model distribution, with the translation
depending only on the value of the statistic and the number of data events. This KS
probability KS_p can then be converted into units of standard deviations KS_σ by
solving $\int_{KS_\sigma}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx = KS_p$.
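
The procedure described in this footnote can be sketched in a few lines. The sketch
makes two assumptions that are not stated in the text: the Standard Model
prediction is supplied as a cumulative distribution function `model_cdf`, and the
KS probability is obtained from the asymptotic Kolmogorov series with
λ = D·sqrt(n). All function names are hypothetical.

```python
import math
from statistics import NormalDist


def ks_statistic(data, model_cdf):
    """Maximal separation between the empirical CDF of data and model_cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = model_cdf(x)
        # the empirical CDF jumps from i/n to (i+1)/n at x
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d


def ks_probability(d, n, terms=100):
    """Asymptotic KS probability for statistic d and n data events."""
    lam = d * math.sqrt(n)
    if lam < 0.2:
        return 1.0   # the probability is indistinguishable from 1 here
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * series))


def ks_sigma(p):
    """Solve integral_{sigma}^{inf} N(x; 0, 1) dx = p for sigma."""
    p = min(max(p, 1e-300), 1.0 - 1e-16)   # keep inv_cdf inside (0, 1)
    return -NormalDist().inv_cdf(p)
```

A KS probability of 0.5 maps to 0σ, probabilities close to 1 map to negative values
(good agreement), and small probabilities map to large positive values, matching the
convention of the right pane of Fig. 4-1.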
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Final State Data Background a 



4b4j 
4b2j 
4b 

3j2r± 

3j277S 

3jn±T± 

3b4j 

3b3j 

3b2jj( 

3b2j 

3b7 

3bj/i 

3bj7 

3bjM='=J* 

3be±)( 



2j7jiT^ 
2b6j 



1 2.4 ± 1.5 

1 0.7 ± 1.1 

1 1.1 ± 1.5 

1 ± 1 

1 0.9 ± 1.3 

3 1.3 ± 1.3 

6 8.1 ± 1.8 

1 2 ± 1.6 
3 0.8 ± 1.2 

2 2.9 ± 1.5 
8 8.2 ± 2 
1 0.7 ± 1.2 

23 27.2 ± 4.8 

1 0.4 ± 1.2 

1 3 ± 1.6 

1 1.8 ± 1.5 

1 1.1 ± 1.2 

1 0.4 ± 1.2 

3 0.8 ± 1.1 
1 1.1 ± 1.3 

1 1.7 ± 1.4 

2 0.3 ± 1.2 



Final State 


Data Background 
















2 1.1 ± 


1 


.2 





Final State 


Data Background 




2bjr=^ 


1 0.8 ± 


1 


.2 





b6jji 


1 0.1 ± 


1 


,1 





^ "-^ J 


3 0.3 ± 


1 


.1 





b4j;i 400+ 


3 1.6 ± 


1 


,4 





2be^/*T^ 


1 0.2 ± 


1 


.1 





bSjji^T* 


1 0.1 ± 


1 







2be^M^/5 


1 2.2 ± 


1.3 





b2jT±TT 


1 0.1 ± 


1 


.1 





72t* 


2 0.1 ± 


1 


.1 





b2j(/±7 


1 0.9 ± 


1.3 





j27T^ 


2 1.8 ± 


1 


.4 





br±rT 


2 1.6 ± 


1 


,3 





j2^t^/i 


1 0.6 ± 


1 


.2 





bu±>iTT 


1 1.1 ± 


1 


,3 





i ff ^ 


1 0.1 ± 


1 


.1 





b^ 7 


1 0.7 ± 


1 


2 







1 0.1 ± 


1 


.1 





bn±nTji 


3 0.7 ± 


1 


,3 





e±4jT-^ 


2 3.1 ± 


1 


.2 





bjT±TT 


1 0.6 ± 


1 


,2 





e*4j;iTj( 


1 0.6 ± 


1 


.2 





bjn^rT 


1 0.5 ± 


1 


,2 





e^4jM^/* 


1 ± 


1 







be^ 3j7 


1 1.4 ± 


1 


,2 





e^4jM^ 


1 0.7 ± 


1 


.2 





be±3jM^?i 


1 0.8 ± 


1 


,2 







4 3 ± 


1 


.4 





be±27 


2 0.2 ± 


1 


.1 





7r ^ 


1 0.9 ± 


1 


.1 





be±2jTT 


2 1.6 ± 


1 


,2 





775t^ 


1 0.5 ± 


1 


.2 





be±2j;iTT 


2 0.9 ± 


1 


,2 







1 0.6 ± 


1 


.1 





be±2j7/( 


3 0.4 ± 


1 


,2 





e±j27j( 


1 0.2 ± 


1 


.1 





be±r± 


1 1.2 ± 


1 


,1 







1 0.8 ± 


1 


.1 





be^ 7/5 


3 2.7 ± 


1 


,5 





e±eT2jM±;i 


1 ± 


1 







be^ j/^T^ 


1 1.5 ± 


1 


,3 







1 0.2 ± 


1 


















Table 4.4: New Vista final states which appeared in the analysis of 2 fb^-1.



There are also 11 final states which were populated in the 1.0 fb^-1 analysis, but are
not now: 1b1e+3j1tau-, 1b3j2ph, 1e+, 1e+1e-1ph1tau+, 1e+3j2ph, 1j1pmiss2tau+,
1j3ph, 2b2ph, 3j1mu+1pmiss1tau+, 3j1pmiss1tau+1tau-, 1b1e+3j1ph1pmiss. These
events were generally found to contain an object (usually a τ or plug photon) which
now fails our tighter identification requirements.

A final reason for the increase of Vista final states from 344 in 1.0 fb^-1 to 399
is that jet-tau final states have been divided into high-pT and low-pT states.

The 3jτ± and 2jτ± final states remain among the 'top ten' most discrepant states,
but their significance has decreased compared to the first round. The improvement
in agreement was achieved after slight changes in modeling jets faking taus in events
with large activity. Other final states from the first round's top ten now exhibit zero
discrepancy (after accounting for the trials factor). We attribute this to a combination
of general improvements in modeling and statistical fluctuations.



4.3 Sleuth 



The Sleuth algorithm was not modified in the second round.

Figure 4-2: The most interesting final states identified by Sleuth in 2 fb^-1.

Figure 4-3: Blue points: The P distribution observed in 1990 pb^-1, with one entry
for each of the 87 Sleuth final states with at least 3 data events. There are 153
Sleuth final states with non-zero background and fewer than 3 data events, which are
assigned P = 1. Black histogram: The expected P distribution from all 240 Sleuth
final states with non-zero background, if instead of actual data we use pseudo-data
pulled from the expected ΣpT distribution of each final state, and omit the final
states where there are fewer than 3 pseudo-data events and which therefore have
P = 1. As explained in Sec. 3.3.1, footnote 5, the P of final states with expected
population < 10 is not uniformly distributed. Of the 240 final states Sleuth
considers in 1990 pb^-1, 171 have Standard Model background of less than 10 events,
which causes the expected P distribution to slightly favor smaller values.






4.3.1 Results 

The most interesting final states highlighted by Sleuth are shown in Fig. 4-2. The
region chosen by Sleuth is shown by the (blue) arrow, extending up to infinity.
CDF Run II data are shown as filled circles; the Standard Model prediction is shown
as a histogram. Sleuth final state labels are in the upper left corner of each panel.
The number at upper right in each panel is P, the fraction of hypothetical similar
experiments in which something as interesting as the region shown would be seen in
this final state. The inset in each panel shows an enlargement of the region selected
by Sleuth, together with the number of events (SM) predicted by the Standard
Model in this region, and the number of data events (d) observed in that region.

The distribution of P for the final states considered by Sleuth in the CDF Run
II data is shown in Fig. 4-3.

In these CDF data, Sleuth finds P̃ = 0.085. This is sufficiently far above the
Sleuth discovery threshold of P̃ < 0.001 that no discovery claim can be made on the
basis of Sleuth for 2 fb^-1.

Study of Same-Sign Sleuth States 

The top Sleuth final states share a common trend: they involve same-sign leptons.
We first consider the 2nd and 3rd Sleuth final states, which both contain a same-sign
electron and muon, significant missing energy, and varying numbers of jets. The
relevant Vista final states are:

Final State         data   background
1e+1mu+1pmiss       31     29.9 ± 1.6
1e+1j1mu+1pmiss     16     9.2 ± 1.9
1e+2j1mu+1pmiss     6      1.7 ± 1.2
1e+3j1mu+1pmiss     0      0.26 ± 0.07

The primary backgrounds for all these final states are similar, although the relative
proportions vary with the number of reconstructed jets. The three main backgrounds
are: W(→ μν)+jets, with a jet faking the electron; Z(→ μ+μ-)+jets, where one μ is
not reconstructed, creating missing energy, and a jet fakes the electron; and
Wγ(+jets), where the photon fakes the electron.

All these processes involve real muons - there is no significant Standard Model 
contribution to these final states from fake muons. Therefore we can discard any 
explanation for the excess in data which involves charge assignment to muons faked 
by jets. 

We can be confident that the charge sign of a real muon is well measured by
the CDF tracking system. The curvature resolution of the chamber is σ_C = 3.6 x
10^-6 cm^-1. The curvature corresponding to a track with momentum of 100 GeV/c is
2.1 x 10^-5 cm^-1. The sign of the curvature of such a track, and hence the charge of
such a particle, is thus typically determined with a significance of better than five
standard deviations [81]. Vista supports this conclusion, since we reconstruct ~25,000
μ+μ- events but only a single μ+μ+ event (and even then, the μ+μ+ invariant mass is
~150 GeV, making it unlikely to be a Z decay with wrong charge reconstruction).
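
The arithmetic behind this statement can be checked with a short calculation. The
sketch assumes a solenoidal field of 1.4 T and the convention that the quoted
curvature is the half-curvature C = 1/(2R), where R is the radius of the track in
the transverse plane; both are assumptions of this illustration rather than
statements taken from the text.

```python
SIGMA_C = 3.6e-6        # curvature resolution quoted in the text, in cm^-1
B_FIELD_TESLA = 1.4     # assumed magnetic field of the solenoid


def half_curvature_per_cm(pt_gev, b_tesla=B_FIELD_TESLA):
    """Half-curvature 1/(2R) in cm^-1 for a track of given pT in GeV/c."""
    radius_m = pt_gev / (0.2998 * b_tesla)   # R[m] = pT[GeV/c] / (0.2998 B[T])
    radius_cm = 100.0 * radius_m
    return 1.0 / (2.0 * radius_cm)


curvature = half_curvature_per_cm(100.0)     # close to 2.1e-5 cm^-1
significance = curvature / SIGMA_C           # between 5 and 6 standard deviations
```

Under these assumptions the curvature of a 100 GeV/c track differs from zero by
more than five times the resolution, so its sign, and hence the charge, is well
determined. Tracks of lower momentum have larger curvature and are even easier.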

We can therefore assume the muon charge is correct, and focus on the electron.
This is a fake electron from a jet. This fake rate is well determined from the
electron+jet(s) events, and similarly the k-factors for the boson+jets processes are
well determined from other final states. We therefore expect the contribution from
these processes to these particular final states to be accurate. Indeed, the most
populous state 1e+1mu+1pmiss is well described, and the mild excesses seen by Sleuth
arise from the 1e+1j1mu+1pmiss and 1e+2j1mu+1pmiss final states. Examination of the
kinematic distributions from these final states yields nothing further (the electron
η_det distributions for these final states are shown in Figs. 4-4 and 4-5), so,
following the above reasoning and given that the effect is not statistically very
significant, we ascribe the presence of these two states towards the top of Sleuth's
list as likely just due to a fluctuation.

The 1st Sleuth final state, 1e+1mu+, also has a same-sign electron and muon, but
no missing energy, and 0 or 1 jets. The potentially relevant Vista final states are:

Final State   data   background
              45     28.5 ± 1.8
              13     8.2 ± 2
              2      2.6 ± 1.6
              2      0.6 ± 1.2

Figure 4-4: Detector η distribution for the electron in 1e+1mu+1pmiss.

Figure 4-5: Detector η distribution for the electron in 1e+1j1mu+1pmiss.

So only the data excess in e+μ+ needs any potential investigation for evidence of
Standard Model background mismodeling. The largest background is from
Z(→ μ+μ-)+jets, with one muon lost and a jet faking an electron. As explained earlier,
this process is well constrained and cannot explain the excess in data.

The next largest background is Z → τ+τ-, with one τ decaying to an electron
and the other to a muon. As discussed above, we trust the muon charge, so the
electron must be reconstructed with the wrong charge. For central electrons, this
occurs at a rate on the order of 1 in 10~^, through electron bremsstrahlung to a
photon with an asymmetric conversion that half the time results in an opposite-charge
electron, and therefore is too small to play a role here. For plug electrons, however,
the track charge has a false-reconstruction rate of order 10% [82]. Fig. 4-6 shows
the η_det of the electron, and we indeed observe that the Z → ττ contribution is
almost entirely in the plug. However, Fig. 4-7, which shows the electron η_det for the
2e+ final state (dominated by real electrons from Z with Phoenix track charge
misassignment), demonstrates that this charge misidentification is quite well modeled;
there is certainly no room for the factor of two increase that would be needed
to explain the data excess. The only other large background is from QCD dijet
events where both electron and muon are fakes. Both of these total fake rates are
very well constrained from the electron+jet(s) and muon+jet(s) final states, so the
only possible flexibility is in the charge assignment to the fakes, which would shift
background events between the 1e+1mu+ and 1e+1mu- final states. However, with our
current modeling, this process contributes an approximately equal number of expected
events (~5) to each of these states. It is implausible to argue that the combination
of QCD Feynman diagrams and faking mechanisms could be such as to significantly
anti-correlate the fake electron and muon charge signs, so this cannot contribute to
the data excess. In conclusion, after examining the possibilities and reminding
ourselves that the similar final states with additional jets are actually well
described, we have no explanation for this excess other than a statistical fluctuation.

Figure 4-6: η_det distribution for the electron in 1e+1mu+.

The 5th most discrepant state in Sleuth is ℓ+τ+. Since Sleuth combines electrons
and muons, the relevant Vista final states are:

Final State   data   background
              36     17.2 ± 1.7
              11     8.3 ± 1.5
              15     12 ± 2
              8      9.4 ± 3.1

One sees that the excess comes only from e+ pmiss τ+. This is actually among the most
discrepant final states in Vista, with a significance of 1.4σ after accounting for the
trials factor. The primary background is W(→ eν)+jet, where the jet ends up faking
a τ with the same charge as the electron. This is rarer than the other case, where
the fake τ has opposite sign to the electron. However, we appear to be modeling this
process quite well, because it applies equally in the case when the W decays to a muon
and a neutrino, and Vista predicts those final states correctly. We therefore believe
the excess in e+ pmiss τ+ is likely just a fluctuation.

Figure 4-7: η_det distribution for the electron in 2e+.

In conclusion, although the top Sleuth states all involve same-sign leptons, we
find no explanation that can simultaneously account for all of them. More data would
help us see to what extent this is due to mismodeling, and to what extent to
statistical fluctuation.



Evolution of the Top Sleuth Final States from 1 fb ^ 

The $b\bar{b}$ final state, which was at the top of the list of Sleuth discrepancies, has
now gone down the list. The reason is that the region selected previously had been
selected based on a relatively small excess in a particular region of $\sum p_T$. Doubling
the data caused that region to exceed the upper limit of 10,000 events. This upper
limit is designed to reject excesses found in regions of high statistics, where even a
small systematic error would cause Sleuth to give a large discrepancy.

The discrepancy in the $j\not{p}$ final state, which is dominated by cosmic events, has
been corrected by the additional quality cuts on the cosmic background.

The 3rd, 4th and 6th most discrepant Sleuth final states from the first round were
same-sign dilepton final states. These final states have become more discrepant in
this round of the analysis, as discussed in Sec. 4.3.1.

The 5th most discrepant Sleuth final state from the first round of the analysis was
$\tau^{+}\not{p}$. At that time a major background contribution, $W(\to \tau\nu)$ + jets,
was missing; it has since been added.

The remaining discrepancies either were corrected by improving the background
modelling or were simply fluctuations.



4.3.2 Sensitivity 

For the 2 fb$^{-1}$ analysis, we have performed an additional test of the sensitivity of
Sleuth to Standard Model single top production.

Final State    Events    Acceptance (%)
Wjj            5149      5.1
Wbb            3231      3.2
W              1977      2.0
W4j            298       0.3
Wbbjj          219       0.2
bb^            128       0.1
n              109       0.1
bb             96        0.1
               59        0.1
bbl3           41        0.0

Table 4.5: Partitioning of events in Single Top into Sleuth final states. The most 
populous final states are shown. The offline selection filter accepts % of the pseudo- 
signal events. The acceptance is shown for each individual final state. 



cost    Final state    $\tilde{\mathcal{P}}$
3600    Wbb            0.0009669
4800    Wbb            0.0003004
3800    Wbb            0.0002808
3600    Wjj            0.0008754
3600    Wbb            0.0002843
3800    Wbb            0.0007113
5000    Wbb            0.0007072
3800    Wbb            0.0003327
5400    Wbb            0.0003309
2800    Wbb            0.0004739

Table 4.6: Summary of "discoveries" for single top. Cost is the number of pseudo-signal
events required to obtain $\tilde{\mathcal{P}} < 0.001$. The second column contains the
final state in which the most interesting region is found at the point of discovery. The
third column contains $\tilde{\mathcal{P}}$ at discovery.

Figure 4-8: (Left) The final state in which single top first appears, as it is before
the addition of any pseudo-signal. (Right) The same final state, after the addition of
the pseudo-signal required for its discovery by Sleuth. For this discovery, 3600
pseudo-signal events yield $\tilde{\mathcal{P}} = 0.0009669$.



This sensitivity test is performed by injecting 'signal' single top events into
pseudo-data generated from the background. Single top events are obtained from the CDF
Top Group Monte Carlo samples stop00 and stop01 (s-channel and t-channel production,
respectively), run through our standard event reconstruction. The acceptance
for the signal events into Sleuth final states is shown in Table 4.5.

Signal events are added to the pseudo-data in chunks, until Sleuth's discovery
threshold of $\tilde{\mathcal{P}} < 0.001$ is reached. To account for random fluctuations,
ten such trials are performed and the final result is averaged over all trials. Table 4.6
summarizes the result of each trial.
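
As a small illustration of the averaging step, the ten trial costs listed in Table 4.6 can be summarized as follows. This is only a sketch: the list `costs` is transcribed from the table, and using the population standard deviation as the measure of trial-to-trial spread is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev

# Pseudo-signal events needed in each of the ten trials (Table 4.6).
costs = [3600, 4800, 3800, 3600, 3600, 3800, 5000, 3800, 5400, 2800]

average_cost = mean(costs)              # events needed on average
spread = pstdev(costs)                  # trial-to-trial spread (assumed estimator)
relative_spread = spread / average_cost

print(f"average cost = {average_cost} events")
print(f"spread = {spread:.0f} events ({100 * relative_spread:.0f}%)")
```

The relative spread of the trials is comparable to the relative uncertainty of the cross-section quoted later in this section (1.1/5.9).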

As expected, Sleuth's 'golden' final state for discovering single top is Wbb. The
~4% acceptance into this final state is consistent with the numbers obtained for
dedicated single top searches [83]. Note that due to the definition of final states in
Sleuth, Wbb contains events with 2 or 3 jets, with at least 1 b-tag. This somewhat merges
the standard single top separation into distinct 2-jet and 3-jet bins, and
this is why the $t\bar{t}$ background contribution is relatively large.

An example 'discovery' is illustrated in Fig. 4-8. This shows the combined background
prediction in the absence of signal, and the $\sum p_T$ distribution after adding
sufficient signal to trigger Sleuth's discovery threshold. Fig. 4-9 illustrates the
$\sum p_T$ distribution from single top signal relative to the combined background prediction.

Figure 4-9: Relative $\sum p_T$ distributions from single top signal and combined
background prediction.

The result of this sensitivity test is that Sleuth would be expected to discover
single top at the $5\sigma$ level in 2 fb$^{-1}$ if it had a cross-section of $5.9 \pm 1.1$ pb.
The Standard Model expected cross-section is 2.86 pb (combined s- and t-channel). A
naive extrapolation therefore leads to an expected luminosity for Sleuth discovery
of $2.0 \times (5.9/2.86)^2 = 8.5 \pm 3.1$ fb$^{-1}$.
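
The naive extrapolation can be reproduced with a few lines of arithmetic. The sketch below assumes that the significance grows like the square root of the luminosity at fixed cross-section, so that the luminosity needed scales as the inverse square of the cross-section; the function name and the first-order error propagation are illustrative choices, not taken from the analysis code.

```python
def discovery_luminosity(lumi_fb, xsec_needed_pb, xsec_needed_err_pb, xsec_sm_pb):
    """Scale the luminosity by (needed / Standard Model cross-section)^2.

    Assumption: significance ~ sqrt(luminosity) at fixed cross-section,
    hence the required luminosity scales as 1 / cross-section^2.
    """
    scale = (xsec_needed_pb / xsec_sm_pb) ** 2
    lumi_needed = lumi_fb * scale
    # First-order propagation: the square doubles the relative uncertainty.
    lumi_err = lumi_needed * 2.0 * xsec_needed_err_pb / xsec_needed_pb
    return lumi_needed, lumi_err


lumi, err = discovery_luminosity(2.0, 5.9, 1.1, 2.86)
print(f"{lumi:.1f} +- {err:.1f} fb^-1")
```

The central value agrees with the 8.5 fb$^{-1}$ quoted above. The propagated uncertainty comes out slightly above the quoted 3.1 fb$^{-1}$, a difference that presumably reflects rounding of the inputs.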

This conclusion seems perhaps surprising given the effort devoted to sophisticated
tools such as Matrix Elements and Neural Networks for dedicated single top searches.
The apparent sensitivity of Sleuth stems from the fact that it treats the background
as being absolutely fixed. Any addition is therefore considered pure signal, allowing
'discovery' of single top with relatively few extra events. In practice this is unrealistic,
since $\sum p_T$ alone would find it hard to distinguish between single top production
and an excess of W+heavy flavour relative to Alpgen predictions, which have a large
uncertainty. In a realistic test, we would probably have to introduce a separate
k-factor for W+heavy flavour, which would swallow up much of the single top signal,
since there is no other populous final state that could constrain the W+heavy flavour
k-factor independently of possible single top contributions. For the dedicated single top
searches, the total backgrounds are generally allowed to float, and more sophisticated
purely 'shape-based' variables are used to discriminate signal from background.



4.4 Bump Hunter 

The bump hunter is a new feature added in the second round of this analysis, to 
enhance the sensitivity of the search to new physics involving narrow mass resonances. 

4.4.1 Strategy 

The idea is to scan the spectrum of most mass variables with a sliding window.
The window needs to vary in width to follow the changing detector resolution. As
the window drifts across a mass distribution, it evaluates the probability that the
amount of data therein, or even more, could have emerged by fluctuation from the
predicted population. The window where this probability is smallest contains the
most interesting local excess of data.

In each final state there are typically several mass variables to scan. On average
there are $5036/399 \approx 13$. They include masses of all combinations of reconstructed
objects, such as pairs, triplets, or bigger ensembles.

The width of the sliding window equals two times the characteristic mass resolution
for the given combination of objects and at the given mass value. Mass uncertainty
results from uncertainties in the energies and momenta of all objects involved. It is
possible to have combinations of four-momenta that result in the same mass, but
different mass uncertainties. For example, if a $Z$ decays to $e^{+}e^{-}$, the mass of
that pair will always be close to the nominal $m_Z = 91$ GeV, though its resolution will
depend on the boost of the decaying $Z$. Obviously, each event has a different mass
uncertainty, so we need to estimate the characteristic mass resolution at each value of
mass and for each mass variable. That characteristic mass resolution will be
representative of the mass resolution of the events there. To estimate it, we assume
that all objects in the ensemble have equal momentum, negligible mass, and that their
momenta balance on a plane (see the footnote below). Then, we assign to each involved
individual energy the appropriate uncertainty, depending on what object it belongs to,
since different objects are measured with different energy resolutions. With energies
in GeV, for electrons and photons the uncertainty is assumed to be
$\Delta E_{\rm EM} = 0.14\sqrt{E} + 0.015E$, determined by the electromagnetic calorimeter.
For jets and $\tau$s it is taken to be
$\Delta E_{\rm had} = E\sqrt{0.457/E + 20.3/E^2 + 0.00834}$, determined by
the hadronic calorimeter. For (beam constrained) muons it is $\Delta E_{\mu} = 0.0005E^2$,
determined by the COT track curvature resolution. In cases of transverse mass
involving $\not{p}$, we assume roughly $\Delta E_{\rm met}$ = 'i\fi>. We propagate those
$\Delta E$s corresponding to the members of the ensemble into the system's total mass.
For example, if we want to find the characteristic mass resolution for a
$(e^{+}, \mu^{-}, j)$ triplet at system mass 90 GeV, we have
$m^2 = (E_e + E_\mu + E_j)^2 - (\vec{p}_e + \vec{p}_\mu + \vec{p}_j)^2$. We assume
$E_e = E_\mu = E_j = E$ and the planar configuration with zero net momentum, to obtain
that $m = 3E$, hence $E = 30$ GeV for each object. We use the above formulas for the
three different $\Delta E$s, keeping in mind the different resolutions for electrons, muons
and jets, and then we propagate those uncorrelated uncertainties to the mass, to find
$\Delta m = \sqrt{(\Delta E_e)^2 + (\Delta E_\mu)^2 + (\Delta E_j)^2} = 6.57$ GeV.

Footnote: If the (equal) momenta are two, to balance they have to be back-to-back. If
they are three, they have to be on the same plane, each separated by 120° from its
first neighbors. If we have $N \geq 4$ equal, balancing momenta in 3 dimensions, then
their angular configurations can be significantly more complicated, as there are many
possible arrangements that satisfy the condition of balance. To avoid such complexity,
we choose to constrain all $N$ vectors in one plane, and assume the solution where all
vectors have separation $2\pi/N$ from their nearest neighbors.

The step size by which the window drifts equals half a characteristic mass resolution,
therefore it varies along the mass spectrum, as the width does too. That way
there are no gaps left between consecutive windows. Instead, consecutive windows
partly overlap.

Each window comes with two sidebands, extending on each side as far as the
window's width. The region of the spectrum that is scanned is slightly narrower than
the whole spectrum's span (defined as the interval between the highest-mass and the
lowest-mass event in both data and background), so that all considered windows have
sidebands lying within the spectrum.

As the window drifts along a mass spectrum, its p-val is calculated at each location.
That is defined as the Poisson probability that the Standard Model events expected
in the window ($b$) would fluctuate up to or above the observed data ($d$), i.e.

$$\text{p-val} = \sum_{i=d}^{\infty} \frac{b^{i}}{i!}\, e^{-b}.$$
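
The ingredients described so far (characteristic resolution, window width and step, sidebands, and the Poisson p-val) can be put together in a short sketch. Everything below is illustrative: the function names, the object labels ("e", "ph", "j", "tau", "mu") and the choice to start the scan at the lower edge of the spectrum are assumptions, energies are taken in GeV, and the missing-momentum case is left out.

```python
import math


def delta_e(kind, e):
    """Energy uncertainty (GeV) assumed for one object of energy e (GeV)."""
    if kind in ("e", "ph"):            # electromagnetic calorimeter
        return 0.14 * math.sqrt(e) + 0.015 * e
    if kind in ("j", "tau"):           # hadronic calorimeter
        # Same as e * sqrt(0.457/e + 20.3/e**2 + 0.00834), but safe at e = 0.
        return math.sqrt(0.457 * e + 20.3 + 0.00834 * e * e)
    if kind == "mu":                   # COT track curvature
        return 0.0005 * e * e
    raise ValueError(f"unknown object kind: {kind}")


def mass_resolution(kinds, mass):
    """Characteristic resolution: equal energies, momenta balanced in a plane."""
    e = mass / len(kinds)
    return math.sqrt(sum(delta_e(k, e) ** 2 for k in kinds))


def scan_windows(kinds, m_lo, m_hi):
    """Return (center, half_width) of every window whose sidebands fit."""
    if m_lo <= 0.0:
        raise ValueError("m_lo must be positive")
    windows, center = [], m_lo
    while True:
        res = mass_resolution(kinds, center)
        # Window = center +- res (width 2*res); each sideband is 2*res wide.
        if center + 3.0 * res > m_hi:
            break
        if center - 3.0 * res >= m_lo:
            windows.append((center, res))
        center += 0.5 * res            # step of half a resolution
    return windows


def poisson_pval(d, b):
    """Poisson probability of observing d or more events when b are expected."""
    if d <= 0:
        return 1.0
    if b <= 0.0:
        return 0.0
    total, i = 0.0, d
    while True:
        # Term b^i e^-b / i!, evaluated in logs to avoid overflow.
        term = math.exp(i * math.log(b) - b - math.lgamma(i + 1))
        total += term
        if i > b and term <= 1e-17 * total:
            break
        i += 1
    return min(total, 1.0)


print(round(mass_resolution(["e", "mu", "j"], 90.0), 2))
print(len(scan_windows(["e", "e"], 50.0, 150.0)), "windows between 50 and 150 GeV")
```

For the $(e^{+}, \mu^{-}, j)$ triplet at 90 GeV, `mass_resolution` reproduces the 6.57 GeV of the worked example.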

A window qualifies as a bump if it satisfies the following criteria: 

• The central region must contain at least 5 data events. 

• Both sidebands must be less discrepant than the central region, i.e. both must 
have larger p-val. 

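• If the background in a sideband is non-zero, then it must have p-val >
$\int_{5}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx$,
namely it must not exhibit a significant ($5\sigma$) discrepancy. If the background is
zero, then it must have fewer than 5 data events (see the footnote on the zero-background case).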

• The above criteria need to hold even when we consider the possible effects of 
low Monte Carlo statistics in the background. This is explained next. 
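
The first three criteria can be sketched as follows. This is a minimal illustration, not the analysis code: the function names are hypothetical, each region is represented as a `(data, background)` pair, the $5\sigma$ threshold is taken as the one-sided Gaussian tail probability, and for a zero-background sideband the "fewer than 5 data" rule is assumed to replace both p-val tests. The fourth criterion (the anti-spike treatment) is not included.

```python
import math

# One-sided Gaussian tail probability beyond 5 sigma (about 2.87e-7).
FIVE_SIGMA_P = 0.5 * math.erfc(5.0 / math.sqrt(2.0))


def poisson_pval(d, b):
    """Poisson probability of observing d or more events when b are expected."""
    if d <= 0:
        return 1.0
    if b <= 0.0:
        return 0.0
    total, i = 0.0, d
    while True:
        term = math.exp(i * math.log(b) - b - math.lgamma(i + 1))
        total += term
        if i > b and term <= 1e-17 * total:
            break
        i += 1
    return min(total, 1.0)


def sideband_ok(d, b, p_center):
    """Check one sideband against the central window's p-val."""
    if b > 0.0:
        p_side = poisson_pval(d, b)
        # Less discrepant than the center, and not a 5-sigma effect itself.
        return p_side > p_center and p_side > FIVE_SIGMA_P
    # Zero background: assumed to replace both p-val tests.
    return d < 5


def is_bump_candidate(center, left, right):
    """center, left, right are (data, background) pairs."""
    d_c, b_c = center
    if d_c < 5:                        # at least 5 data events in the center
        return False
    p_c = poisson_pval(d_c, b_c)
    return sideband_ok(*left, p_c) and sideband_ok(*right, p_c)


print(is_bump_candidate((20, 5.0), (4, 5.0), (6, 5.0)))
print(is_bump_candidate((20, 5.0), (30, 5.0), (5, 5.0)))
```

The first call describes a central excess with quiet sidebands, the second a left sideband that is more discrepant than the center.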

Footnote: This special treatment of the zero-background case is meant to spot excesses
of data that may be isolated at mass values where there is no Standard Model background
at all. If we had, for example, 6 events in the central window and 1 event in the
sideband, we would not want this bump candidate to be disqualified for having a
discrepant sideband.

It can happen that there is a great excess of data in the central window and,
simultaneously, non-discrepant sidebands, but that the sidebands contain only a couple
of very large-weight events in the Standard Model background. These large-weight
events are called "spikes", and are the result of limited Monte Carlo statistics. That
bump would potentially pass all quality criteria and appear to be statistically
significant, but it would be prudent to treat the presence of spikes in the sidebands
conservatively, and consider that these Monte Carlo events could easily have been in the
central window instead. In that case, the p-val of the central window would be larger
(less significant) and the sidebands would have a higher probability to be discrepant,
hence the bump could be disqualified. Since limited Monte Carlo statistics are a practical
limitation, we have to be conservative and eliminate, if necessary, this bump. To
do that, we first need to define what we consider as a spike in each sideband, and
reevaluate the p-val and the quality of the bump, assuming the spikes from both
sidebands are transferred into the central window.

To define the weight of spikes in a sideband, we look for outliers among the Monte
Carlo events, namely for events with significantly larger weights than the average
weight of the events in the sideband. We find the average weight and the standard
deviation of weights in the sideband, including in the calculation all Monte Carlo
events therein. If there is an event whose weight lies beyond 3 standard deviations
from the average, then we gradually reduce its weight. As we reduce it, we reevaluate
the average and standard deviation of weights. If along its path towards smaller weight
it meets another event of the same weight, then their weights are bound to be equal from
then on, and keep being gradually reduced together. To visualize this process, imagine
the axis of weights as a horizontal stretched string, and the weight of each event
represented by the position of a tiny bead along this string; the larger the weight,
the farther to the right the bead is located. If there are significant outliers, namely
beads very far to the right, we start pushing the rightmost bead slowly from right to
left, to bring it closer to the others. On its way, the rightmost bead drags with it any
beads it meets, since beads cannot pass through each other. We stop this reduction of
weights when they are all within 3 standard deviations from their average. Then, we
compare the total initial weight to the total final weight in the sideband. The
difference is the weight attributed to spikes. If this difference turns out to be
smaller than the largest single weight in the sideband, then we define the latter as
the spike instead.

To save time, we do not apply the anti-spike treatment described above unless the
p-value of a qualifying bump candidate is smaller than a Gaussian tail probability of
the form $\int \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx$, since it is not crucial to
be conservative when a bump is not significant to begin with. A demonstration of
the effect of the anti-spike treatment is shown in Fig. 4-10.
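
The bead picture can be sketched as a cap that is lowered gradually: every weight above the cap is clipped to it, which is what "dragging the beads along" amounts to. This is an illustrative sketch under stated assumptions: weights are taken to be non-negative, the cap is lowered in discrete steps of 0.1% (the parameter `shrink` is hypothetical), and the population standard deviation is used.

```python
import math


def _mean_std(values):
    """Average and (population) standard deviation of a list of weights."""
    m = sum(values) / len(values)
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def spike_weight(weights, n_sigma=3.0, shrink=0.999):
    """Return (spike, clipped) for the Monte Carlo weights of one sideband."""
    if not weights:
        return 0.0, []
    largest = max(weights)
    cap = largest
    clipped = list(weights)
    while True:
        avg, std = _mean_std(clipped)
        limit = avg + n_sigma * std
        # The largest clipped weight equals cap; stop once it is no outlier.
        if cap <= limit * (1.0 + 1e-12):
            break
        cap *= shrink                              # push the rightmost bead left
        clipped = [min(w, cap) for w in weights]   # beads it meets move along
    removed = sum(weights) - sum(clipped)
    # If less weight was removed than the largest single weight, the largest
    # single weight is taken as the spike instead.
    return max(removed, largest), clipped


spike, _ = spike_weight([1.0] * 30 + [40.0, 40.0])
print(f"weight attributed to spikes: {spike:.1f}")
```

In the assumed reading, the returned spike weight would then be moved from the sideband into the central window before the p-val and the quality criteria are reevaluated.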

When a variable's spectrum is scanned from one end to the other, the qualifying
bump with the smallest p-val is the most interesting within that variable. Its
statistical significance is quantified at first level by its p-val; the smaller, the
more significant. It is crucial, though, to account for the trials factor due to
examining many windows within that spectrum. We need, therefore, to estimate the
probability that a qualifying bump candidate of such a small (or smaller) p-val would
appear anywhere along the spectrum, if instead of the actual data we had pseudo-data
pulled from the Standard Model distribution. We denote this probability $P_a$, and it
can be estimated either experimentally (by producing many sets of pseudo-data and
scanning them for more interesting bumps), or using a semi-analytic calculation.

Figure 4-10: (Left) The p-val of each bump candidate, as a function of the location
of each window's center, along mass(j1,j2) in final state 2j $\sum p_T < 400$ GeV.
Bump candidates failing quality criteria have p-val = 1. The most significant bump
has p-val ~ 10~^, which translates to $P_a \sim 3 \times 10^{-5}$ and $P_b \sim 0.15$,
therefore all local excesses are insignificant. (Right) For demonstration, we apply the
conservative anti-spike treatment to all bump candidates. The result of the anti-spike
treatment is larger p-values, and the reduction of significance is greater in regions
such as around 400 GeV, where Monte Carlo statistics are poorer and spikes therefore
contribute more.

The semi-analytic method, whose goal is to save the enormous time-cost of using
Monte Carlo to experimentally evaluate $P_a$ for all mass variables, proceeds as follows:
For each window and its sidebands, we estimate with Monte Carlo the probability that
it would satisfy quality criteria ($P(Q)$), if the data populations in the center and in
the sidebands were pulled randomly from the respective expected populations therein.
Let's denote the p-val of the most interesting bump in the actual data p-val$_{\min}$.
Denote the probability that a window would have p-val < p-val$_{\min}$ as $P(S)$. The
probability that a window would qualify and simultaneously have p-val < p-val$_{\min}$ is
$P(Q \wedge S) \simeq P(Q)\,P(S)$, where we assumed that $Q$ and $S$ are independent.
This is not generally true, but holds approximately in most cases. In fact
$P(Q|S) = \frac{P(Q \wedge S)}{P(S)} > P(Q)$,
because if $S$ is true then we have a significant excess of data in the central window,
which makes it somewhat less likely for the sidebands to exhibit a bigger discrepancy
than the center, hence it is more likely that quality standards ($Q$) will also be met.
So, $P(Q)\,P(S) < P(Q|S)\,P(S) = P(Q \wedge S)$, i.e. we slightly underestimate
$P(Q \wedge S)$ by approximating it with $P(Q)\,P(S)$.

$P(S)$ is approximately p-val$_{\min}$, but that is exactly correct only as long as there
is an integer number of data that, given the background in the window, would result in
a p-val of exactly p-val$_{\min}$. If that is not the case, then $P(S) <$ p-val$_{\min}$,
because to exceed in significance the most interesting bump, this window would need to
exhibit a p-val not just equal to p-val$_{\min}$, but smaller. For example, if
p-val$_{\min} = 0.01$ and the background is $b = 3.2$, then to exceed p-val$_{\min}$ in
significance we need the data to be at least $d = 9$. If $d = 8$ then
p-val $= 0.017 > 0.01$. However, if $d = 9$ then p-val $= 0.0057$, which means that the
true $P(S)$ in this example would be 0.0057 instead of 0.01. This difference becomes
negligible for large backgrounds, where one event more or less changes p-val negligibly.

We find, as described, $P(Q \wedge S)$ for all windows considered along the spectrum,
and set $P_a = 1 - \prod \left(1 - P(Q \wedge S)\right)$, namely the probability that at
least one window would qualify and surpass in significance the most interesting bump in
the actual data. Here, another assumption is implicit: that windows are independent.
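
The semi-analytic estimate can be sketched in a few lines. The sketch is illustrative only: each window is represented by a hypothetical `(background, P(Q))` pair, with $P(Q)$ supplied from outside (in the analysis it comes from Monte Carlo), and $P(S)$ is taken as the p-val of the smallest data count whose p-val does not exceed p-val$_{\min}$, which is an assumed reading of the discreteness argument above.

```python
import math


def poisson_pval(d, b):
    """Poisson probability of observing d or more events when b are expected."""
    if d <= 0:
        return 1.0
    if b <= 0.0:
        return 0.0
    total, i = 0.0, d
    while True:
        term = math.exp(i * math.log(b) - b - math.lgamma(i + 1))
        total += term
        if i > b and term <= 1e-17 * total:
            break
        i += 1
    return min(total, 1.0)


def prob_s(b, pval_min):
    """P(S) for a window with expected background b."""
    if not 0.0 < pval_min <= 1.0:
        raise ValueError("pval_min must be in (0, 1]")
    d = 0
    while True:
        p = poisson_pval(d, b)
        if p <= pval_min:              # first data count significant enough
            return p
        d += 1


def fast_p_a(windows, pval_min):
    """P_a = 1 - prod(1 - P(Q) P(S)) over (background, P(Q)) pairs."""
    none_better = 1.0
    for b, p_q in windows:
        none_better *= 1.0 - p_q * prob_s(b, pval_min)
    return 1.0 - none_better


print(f"P(S) for b = 3.2, p-val_min = 0.01: {prob_s(3.2, 0.01):.4f}")
```

For $b = 3.2$ and p-val$_{\min} = 0.01$ this gives the 0.0057 of the example in the text. The linear search over `d` is kept for clarity and is not optimized for large backgrounds.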

A comparison between the semi-analytic (fast) and the experimental (slow) method to
estimate $P_a$ is shown in Fig. 4-11. Pseudo-data were pulled from all mass
distributions, and then both the slow and the fast methods were used to estimate $P_a$.
The comparison shows that, for pseudo-data, the fast method returns a $P_a$ which is,
when translated into units of standard deviation, within about $1\sigma$ of the $P_a$
determined by the slow method. This difference is reflected in the expected
distributions of $P_a$ from all mass variables when using the two methods. While the
slow method returns a $P_a$ with uniform expected distribution, the fast method's $P_a$
is distributed as shown in Fig. 4-12.

The slow method does not rely on any approximation, therefore its answer is more
representative of the true $P_a$. It is only limited by the number of pseudo-data sets
that we can generate. Its disadvantage is that even when applied to just one mass
variable to estimate the significance of its most interesting bump, it can take
prohibitively long. How long depends on the number of expected events in the final
state where the mass variable belongs, but more importantly on the smallness of
p-val$_{\min}$. For really significant bumps (p-val$_{\min}$ ~ 10"*^) it may take millions
of sets of pseudo-data to start resolving $P_a$ experimentally. The slow method returns
the best estimate of $P_a$ it could obtain within the amount of time it was allowed to
run. If during this amount of time it is clear at 95% confidence level that $P_a$ is
either greater or smaller than what corresponds to a $5\sigma$ effect
($\int_{5}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx = 2.87 \times 10^{-7}$), then
the slow method returns the estimated value of $P_a$ at that time, since the conclusion
is clear and additional accuracy would be of no use. Due to its great time cost, we
employ the slow method only if the fast (semi-analytic) method has returned a
significant enough $P_a$, i.e. smaller than what corresponds to a $4.5\sigma$ effect. The
final significance of a bump is not quantified by $P_a$, but by $P_b$ (defined later),
which includes the whole trials factor. For $P_a$ equivalent to $4.5\sigma$, $P_b$ is
$2.1\sigma$, safely away from the discovery threshold of $3\sigma$ in $P_b$, which
corresponds to $P_a$ of $5\sigma$. This is mentioned to explain that the slow and more
accurate estimator for $P_a$ is employed not just beyond the discovery threshold, but
safely earlier, when a bump starts being mildly significant.

Since $P_a$ encompasses the trials factor from examining multiple windows within
the mass variable, it characterizes the significance of the mass variable in terms of
its most interesting bump. The next question is what the probability is that in a
pseudo-experiment, where data are pulled from the Standard Model expectation, any
mass variable would appear with a $P_a$ smaller than the actual $P_a$ of the mass
variable. We denote this probability as $P_b$. We estimate it assuming all mass variables
are statistically independent trials, therefore $P_b = 1 - (1 - P_a)^{N}$, where $N$ is
the total population of scanned mass variables from all Vista final states.
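
The relation between $P_a$, $P_b$ and units of standard deviation can be checked numerically. The sketch below assumes one-sided Gaussian tail probabilities for the conversion and uses the 5036 mass distributions quoted in this chapter as the number of trials; the function names are illustrative.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def sigma_to_p(n_sigma):
    """One-sided Gaussian tail probability beyond n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def p_to_sigma(p):
    """Inverse of sigma_to_p, valid for 0 < p < 1."""
    return -_STD_NORMAL.inv_cdf(p)


def p_b(p_a, n_variables):
    """P_b = 1 - (1 - P_a)^N, evaluated stably for tiny P_a (requires P_a < 1)."""
    return -math.expm1(n_variables * math.log1p(-p_a))


N_MASS_VARIABLES = 5036  # number of scanned mass distributions

for n_sigma in (4.5, 5.0):
    pb = p_b(sigma_to_p(n_sigma), N_MASS_VARIABLES)
    print(f"P_a at {n_sigma} sigma -> P_b at {p_to_sigma(pb):.1f} sigma")
```

With these assumptions, a $P_a$ of $4.5\sigma$ maps to a $P_b$ of about $2.1\sigma$ and a $P_a$ of $5\sigma$ to about $3\sigma$, in line with the numbers quoted earlier in this section.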

In summary, for each mass variable the most interesting bump is the one with the
smallest p-val, and with all trials factors accounted for, its significance is
approximately given by $P_b$. Then $P_b$ is converted to units of standard deviations,
and if it corresponds to a $3\sigma$ effect or more, then we consider it a discrepancy
worth pursuing.

Figure 4-11: Comparison of fast versus slow method to estimate $P_a$. Each point
corresponds to a mass variable with at least one qualifying bump in pseudo-data.
The three lines indicate the locus where the fast estimate of $P_a$ is equal to, or
$\pm 1\sigma$ away from, the slow estimate of $P_a$. Slow $P_a$ can only be a rational
number, since it is the ratio of two integers, namely the number of pseudo-data
distributions with a more interesting bump and the total number of tried pseudo-data
distributions. That is why the slow $P_a$ appears to assume a discrete spectrum of values.

Figure 4-12: Expected distribution of the fast and the slow estimator of $P_a$, when
applied to pseudo-data. The slow estimator (left) is distributed according to a normal
distribution (except for some recurrent values which reflect that the slow estimator can
only be a rational number), while the fast one (right) follows a Gaussian probability
density function with mean 0.2204 and standard deviation 1.453. In the right plot,
the Normal distribution has been drawn for comparison.

4.4.2 Results



The summary of the most interesting bump in each mass variable is shown in Fig. 4- 
13. 

The only mass variable with its most significant bump exceeding the discovery
threshold is the mass of all four jets in the final state with four jets and
$\sum p_T < 400$ GeV, shown in Fig. 4-14. This is attributed to the "3-jet" effect, the
main cause of all shape discrepancies in this analysis. Fig. 4-15 shows another instance
of the same effect in that final state. The same effect is observed in final states of
different jet multiplicity, as shown in Fig. 4-16.

Although no discovery-level bumps were found in other mass variables, it is inter- 
esting to present the most interesting bumps found in some mass distributions. 

In the mass of the $(e^{+}, e^{-})$ pair in the final state with two opposite-sign
electrons ($e^{+}e^{-}$), the most significant bump corresponds to a $2.7\sigma$ effect,
which, however, lies exactly at the Z boson resonance. The number of expected events
there is so high that even the slightest systematic mismodeling would appear as very
statistically significant. From Fig. 4-17 it is clear that this "bump" is not due to
new physics, but to a tiny systematic mismodelling of the Z-peak, with no visible effect
anywhere else.

The mass of the two muons in the $\mu^{+}\mu^{-}$ final state does not have any
significant bump either, not even of the mundane kind found in $e^{+}e^{-}$. That is
shown in Fig. 4-18.

Another potentially interesting mass variable is the dijet mass in the final state
with two high-$\sum p_T$ jets. That is shown in Fig. 4-19. Unfortunately, no high-mass
di-jet resonance was observed.

4.4.3 Sensitivity 

To test the sensitivity of the Bump Hunter, we generate some specific new physics
signal, pass it through the full CDF detector simulation, and inject it gradually on top
of pseudo-data pulled from the Standard Model background, until the Bump Hunter
identifies a discovery-level bump.

[Plot labels: CDF Run II Preliminary (2 fb$^{-1}$); Mass distributions: 5036]

Figure 4-13: Significance of the most interesting bump in each mass variable. Each
entry corresponds to one mass distribution found to contain at least one bump satisfying
quality criteria. The quantity distributed is $P_a$, transformed to units of standard
deviation ($\sigma$), using the formula
$P_a = \int_{\sigma}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx$. Large $P_a$
translates to a small number of $\sigma$ and signifies an insignificant effect. The
discovery threshold corresponds to $5\sigma$. The entries under $4.5\sigma$ have been
estimated using the semi-analytic (fast) method, which yields values distributed
according to the black curve when applied on pseudo-data agreeing with the Standard
Model background. Values above $4.5\sigma$ are estimated using the slow, more accurate
method. Therefore, values of $P_a$ corresponding to more than $4.5\sigma$ can be
translated directly to significance, since their expected values follow the Normal
distribution. About 5000 mass distributions are considered, which means that to have an
effect of significance $3\sigma$ after trials factor, it needs to have a significance of
$5\sigma$ or more in this scale of $P_a$. Only one mass distribution has its most
significant bump exceed this discovery threshold. More details in the text.

Figure 4-14: The most significant bump found in the 4j $\sum p_T < 400$ GeV final state,
indicated by the blue lines. Its $P_b$ translates to $4.1\sigma$.



120 GeV Higgs in association with W 

The pseudo-signal used for this test contains a Standard Model Higgs of mass 120 GeV,
allowed to decay to $b\bar{b}$, which has branching ratio 68% [84]. The associated W
decays to $e$, $\mu$ or $\tau$ plus neutrino, with total branching ratio $\sim 1/3$.

About 6500 signal events are required to obtain the first bump beyond discovery
threshold. Events passing selection criteria are distributed in several final states,
and 15 of them make it to the $2b\,e^{+}\not{p}$ final state, producing the bump in
Fig. 4-20. Compensating for the branching ratio, we find that the required cross section
of $WH_{120\,\mathrm{GeV}}$ to have this $5\sigma$ level discovery would be about 14.4 pb,
which is ~90 times larger than the predicted Standard Model cross section.



Z' at mass 250 GeV 

Pseudo-signal of a 250 GeV Z' boson was generated, where Z' may decay to
$\ell^{+}\ell^{-}$, where $\ell$ can be $e$, $\mu$, or $\tau$. The first discovery-level
bump is caused after injecting about 700 events of this pseudo-signal. 55 events end up
in the 1e+1e- final state, and form the bump shown in Fig. 4-21.

Figure 4-15: (Upper) The "3-jet" effect appearing in the angular separation between
the second (j2) and the third (j3) leading jets, in final state 4j $\sum p_T < 400$ GeV.
There is an excess of soft final state radiated jets emitted at small angles. The lower
two distributions from the same final state demonstrate exactly this excess, which is
not present in the $p_T$ of the first and second leading jets.

Figure 4-16: The "3-jet" effect appearing in the mass of all jets in the final state with
three (left) and five (right) jets. The excess is similar to the one identified as a bump
in the 4j $\sum p_T < 400$ GeV final state. The difference in the case of 3j
$\sum p_T < 400$ GeV is that the excess is wide and the sidebands are discrepant, making
this bump candidate fail the quality criteria, while in the case of 5j the excess
satisfies bump quality criteria, but has $P_b$ corresponding to only $1.5\sigma$.

With 700 injected events the significance found is $3.7\sigma$, which is higher than the
discovery threshold of $3\sigma$. That is because the pseudo-signal is injected in bunches
of 100 events, so the actual requirement is between 600 and 700 events. Dividing this
number of generated events by our integrated luminosity shows that we would need
the cross section times branching ratio of this signal to be approximately 0.325 pb.



Z' tt at mass 500 GeV 

For this test we generated Z' events of mass 500 GeV, where the heavy boson decays 
to a tt pair. Injecting 5000 such events causes simultaneously two significant bumps 
in the be~^3j^ final state; one is in the transverse mass between ^ and the second 
highest pt jet (j2), with significance 3a] the other is in the transverse mass of the 
third highest pt (j3) and with significance 3.2a. The latter is shown in Fig. 4-22. 

In another instance, after injecting 4600 different pseudo-signal events, a 3.3a 
effect after trials factor was created in the same final state {be'^3j^), but this time in 
the variable rritt, where one would more easily interpret the excess as due to resonant 
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Figure 4-17: (Upper two) The most interesting bump found in final state e+e~. (Bot- 
tom) The p-val of all bumps accross the mass spectrum of the two leptons. Apart 
from this discrepancy at the Z-peak, which corresponds to a 2.7a effect after trials 
factor and reflects only a tiny mismodeling in a region with very high statistics, no 
other significant bump was found. 
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Figure 4-18: (Upper two) The most interesting bump found in final state /.i^/x". 
(Bottom) The p-val of all bumps accross the mass spectrum of the two leptons. Even 
the most significant bump, at the Z-peak, has Pf, — 0.74, therefore is completely 
insignificant. 
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Figure 4-19: (Upper two) The most interesting bump found in final state 2j YIPt > 
400 GeV. (Bottom) The p-val of all bumps accross the di-jet mass spectrum. Even 
the most significant bump, yields Pb — 0.99, therefore is completely insignificant. 
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Figure 4-20: Example of a pseudo-discovery of the Standard Model Higgs boson 
{rriH = 120 GeV), produced in association with a W boson. Out of 7000 generated 
WH{-^ iubb) events, 15 populate the 2be~^'^ final state. They cause this local excess 
which is identified by the Bump Hunter algorithm and its significance is estimated a 
3.4(7 after trials factor. 
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Figure 4-21: Example of a pseudo-discovery of a 250 GeV Z' decaying to charged 
leptons. Out of 700 generated events, 55 populate the e^e~ final state, where the 
most significant bump appears. The significance of this bump is estimated at 3.7(T 
after trials factor. 
Figure 4-22: (Left) Most significant bump after injecting 5000 $Z'_{500\,\mathrm{GeV}} \to t\bar{t}$ events. 47
signal events make it to the $b\,e^{\pm}\,3j\not{p}$ final state, which cause this bump of significance
$3.2\sigma$ after trials factor in transverse mass of the third highest $p_T$ jet and (Right)
Most significant bump after injecting a different 4600 $Z'_{500\,\mathrm{GeV}} \to t\bar{t}$ events, on a
background that was allowed to fluctuate anew. 41 signal events make it to the
$b\,e^{\pm}\,3j\not{p}$ final state, which cause this bump of significance $3.3\sigma$ after trials factor in
mtj, consistent naturally with the mass of the introduced Z'.

production of $t\bar{t}$. That is shown in Fig. 4-22 as well.

With discovery cost of approximately 4800 events, the required cross section is 
approximately 2.4 pb. 
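
As a rough cross-check of this arithmetic, assuming that the discovery cost counts produced signal events and taking the integrated luminosity of this round to be about 2 fb$^{-1}$ = 2000 pb$^{-1}$, the required cross section follows from $N = \sigma L$:

```latex
\sigma \approx \frac{N}{L} = \frac{4800}{2000\ \mathrm{pb}^{-1}} = 2.4\ \mathrm{pb}
```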



4.5 Summary of second round with 2 fb$^{-1}$

Vista and Sleuth search for outliers, representing significant discrepancies between
data and Standard Model prediction. Unfortunately, the result obtained is that no
significant outliers have been found either in the total number of events in the Vista
exclusive final states, or in Sleuth's search of the tails. Disregarding effects
from tuning corrections to the data, Sleuth's $\tilde{\mathcal{P}}$ provides a rigorous statistical
calculation of the likelihood that the most discrepant Sleuth final state seen would have
arisen purely by chance from the Standard Model prediction and correction model
constructed within Vista.

Vista's correction model does not explicitly include some sources of systematic
uncertainty, including those associated with parton distribution functions and showering
parameters in the event generators used; these sources of uncertainty are included
implicitly, in that they would be considered if necessary in the event of a possible
discovery. Other uncertainties related to the modeling of the CDF detector response and
object identification criteria are determined as part of Vista but are not included in
the calculation of $\tilde{\mathcal{P}}$. For the correction model used, Sleuth finds $\tilde{\mathcal{P}} = 0.085$.

The Bump Hunter, a new algorithm for identification of mass resonances, did not 
find any significant mass bumps either, except for one that is attributed to Pythia 
not modeling perfectly parton showering. 

Although the Vista correction model could presumably be improved further to
show even better agreement with Standard Model prediction, finding $\tilde{\mathcal{P}} \gg 0.001$
indicates that even the most discrepant $\sum p_T$ tail is not of statistical interest. The
correction model used is thus good enough (even without considering the effect of
systematic uncertainties on the Sleuth final states) to conclude this search for outliers
using Vista and Sleuth in 2 fb$^{-1}$.

This analysis does not prove that there is no hint of new physics buried in these
data; merely that this search does not find any.
Chapter 5

Grand Summary and Conclusion 



This thesis presents the first model-independent search for new physics of such scope. 

The Standard Model was implemented using a simplified set of corrections. 

New physics was sought that would cause significant discrepancies in (a) popula- 
tions of exclusive final states, (b) shapes of kinematic distributions, (c) mass spectra, 
and (d) high-$\sum p_T$ events.

The search was first conducted in 1 fb$^{-1}$ of CDF II data, revealing no ground on
which to support a discovery claim. It was then repeated in 2 fb$^{-1}$ of data, improved
and enhanced with the Bump Hunter, an algorithm to locate narrow resonances due 
to new massive particles. 

Unfortunately and surprisingly, even with 2 fb$^{-1}$ the result was null, in the sense
that no new physics could be claimed with the findings. The discrepancies seen were
attributed mainly to the difficulty in modeling soft radiated parton showers with
Pythia. This issue was suspected to be problematic, but no other analysis had
illustrated its repercussions so clearly.

Although no single analysis can guarantee that new physics is nowhere in the data, 
it is highly informative that in a search of this scope nothing exploitable was found. 
This is complemented and corroborated by the numerous searches, dedicated to specific
signals, which so far have also failed to reveal what lies beyond the Standard Model.

Even with a null result, the value of this technique is great in providing an overview 
of all data, even those nobody ever considers. It can make a big difference at the later
stages of the LHC, or in any experiment where there is a proliferation of data, and
a fairly accurate theoretical prediction analogous to what our event generators and 
detector simulation provide. 
Appendix A

Correction Model Details 



Some aspects of the correction model are fixed, rather than dynamically adjusted by 
the global fit, which is viewed as just a tool to provide reasonable values for some 
parameters of the correction model. Not every parameter needs to be determined by 
a fit, as long as it is reasonable or estimated beforehand, through a MC study for 
instance. 

Implementation details of the correction model are described in this appendix
in some extra detail.

A.1 Fake rate physics

The following facts begin to build a unified understanding of fake rates for electrons, 
muons, taus, and photons. This understanding is woven throughout the correction 
model, and significantly informs and constrains the Vista correction process. Explicit 
constraints derived from these studies are provided in Appendix A.3. The underlying
physical mechanisms for these fakes lead to simple and well justified relations among 
them. 

Table A.1 shows the response of the CDF detector simulation, reconstruction, and
object identification algorithms to single particles. Using a single particle gun, $10^5$
particles of each type shown at the left of the table are shot with $p_T = 25$ GeV into the
CDF detector, uniformly distributed in $\theta$ and in $\phi$. The resulting reconstructed object
Table A.1: Central single particle misidentification matrix. Using a single particle
gun, $10^5$ particles of each type shown at the left of the table are shot with $p_T =
25$ GeV into the central CDF detector, uniformly distributed in $\theta$ and in $\phi$. The
resulting reconstructed object types are shown at the top of the table, labeling the
table columns. Thus the rightmost element of this matrix in the fourth row from the
bottom shows $p(\tau^- \to b)$, the number of negatively charged tau leptons (out of $10^5$)
reconstructed as a b-tagged jet.
Figure A-1: Transverse momentum distribution of reconstructed objects (labeling
columns) arising from single particles (labeling rows) with $p_T = 25$ GeV shot from a
single particle gun into the central CDF detector. The area under each histogram is
equal to the number of events in the corresponding misidentification matrix element of
Table A.1, with the vertical axis of each histogram scaled to the peak of each
distribution. A different vertical scale is used for each histogram, and histograms with
fewer than ten events are not shown. The horizontal axis ranges from 0 to 50 GeV.






types are shown at the top of the table, labeling the columns. The first four entries on
the diagonal at upper left show the efficiency for reconstructing electrons and muons.^1
The fraction of electrons misidentified as photons (top row, seventh column) is seen
to be roughly equal to the fraction of photons identified as electrons or positrons
(fifth row, first and second columns), and measures the number of radiation lengths
in the innermost regions of the CDF tracker. The fractions of B mesons identified
as electrons or muons, primarily through semileptonic decay, are shown in the four
left columns, eleventh through fourteenth rows. Other entries provide similarly useful
information, most easily comprehensible from simple physics.

The transverse momenta of the objects reconstructed from single particles are
displayed in Fig. A-1. The relative resolutions for the measurement of electron
and muon momenta are shown in the first four histograms on the diagonal at upper
left. The histograms in the left column, sixth through eighth rows, show that single
neutral pions misreconstructed as electrons have their momenta well measured, while
single charged pions misreconstructed as electrons have their momenta systematically
undermeasured, as discussed below. The histogram in the top row, second column
from the right, shows that electrons misreconstructed as jets have their energies
systematically overmeasured. Other histograms in Fig. A-1 contain similarly relevant
information, easily overlooked without the benefit of this study, but understandable
from basic physics considerations once the effect has been brought to attention.

Here and below $p(q \to X)$ denotes a quark fragmenting to X carrying nearly all
of the parent quark's energy, and $p(j \to X)$ denotes a parent quark or gluon being
misreconstructed in the detector as X.



^1 The electron and muon efficiencies shown in this table are different from the correction factors
0025 and 0027 in Table 4.2, which show the ratio of the object efficiencies in the data to the object
identification efficiencies in CdfSim.
Figure A-2: A few of the most discrepant distributions in the final states $ej$ and $j\mu$,
which are greatly affected by the fake rates $p(j \to e)$ and $p(j \to \mu)$, respectively.
These distributions are among the 13 significantly discrepant distributions identified
as resulting from coarseness of the correction model employed.
Figure A-3: A few of the most discrepant distributions in the final states $j\tau$ and $j\gamma$,
which are greatly affected by the fake rates $p(j \to \tau)$ and $p(j \to \gamma)$, respectively. The
distributions in the $j\gamma$ final state are among the 13 significantly discrepant
distributions identified as resulting from coarseness of the correction model employed.
The probability for a light quark jet to be misreconstructed as an $e^+$ can be written

$$
\begin{aligned}
p(j \to e^+) = {} & p(q \to \gamma)\, p(\gamma \to e^+) + {} \\
& p(q \to \pi^0)\, p(\pi^0 \to e^+) + {} \\
& p(q \to \pi^+)\, p(\pi^+ \to e^+) + {} \\
& p(q \to K^+)\, p(K^+ \to e^+).
\end{aligned}
\tag{A.1}
$$

A similar equation holds for a light quark jet faking an $e^-$.
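
The structure of Eq. A.1 (a sum over intermediate particles of a fragmentation probability times a single-particle misidentification probability) can be sketched in a few lines. This is only an illustration: the function name, the dictionary layout, and every numerical value below are hypothetical placeholders, not entries of Table A.1 or fitted Vista correction factors.

```python
# Sketch of Eq. A.1: p(j -> e+) as a sum over intermediate particles X of
# p(q -> X) * p(X -> e+). All numbers are hypothetical placeholders.

def jet_fake_rate(fragmentation, misid):
    """Sum p(q -> X) * p(X -> target) over every intermediate particle X."""
    return sum(fragmentation[x] * misid[x] for x in fragmentation)

# p(q -> X): quark fragments into X carrying nearly all of its energy (assumed values)
fragmentation = {"gamma": 1e-4, "pi0": 1e-3, "pi+": 1e-3, "K+": 3e-4}
# p(X -> e+): single-particle misidentification probability (assumed values)
misid_to_positron = {"gamma": 0.015, "pi0": 0.015, "pi+": 0.002, "K+": 0.001}

p_jet_to_positron = jet_fake_rate(fragmentation, misid_to_positron)
print(f"p(j -> e+) = {p_jet_to_positron:.3e}")
```

The same helper would apply to the muon, photon, and tau relations that follow, with the dictionaries restricted to the contributing intermediate particles.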

The probability for a light quark jet to be misreconstructed as a $\mu^+$ can be written

$$
p(j \to \mu^+) = p(q \to \pi^+)\, p(\pi^+ \to \mu^+) + p(q \to K^+)\, p(K^+ \to \mu^+).
\tag{A.2}
$$

Here $p(\pi \to \mu)$ denotes pion decay-in-flight, and $p(K \to \mu)$ denotes kaon
decay-in-flight; other processes contribute negligibly. A similar equation holds for a
light quark jet faking a $\mu^-$.

The only non-negligible underlying physical mechanisms for a jet to fake a photon
are for the parent quark or gluon to fragment into a photon or a neutral pion, carrying
nearly all the energy of the parent quark or gluon. Thus

$$
p(j \to \gamma) = p(q \to \pi^0)\, p(\pi^0 \to \gamma) + p(q \to \gamma)\, p(\gamma \to \gamma).
\tag{A.3}
$$

Up and down quarks and gluons fragment nearly equally to each species of pion;
hence

$$
\tfrac{1}{3}\, p(q \to \pi) = p(q \to \pi^+) = p(q \to \pi^-) = p(q \to \pi^0),
\tag{A.4}
$$

where $p(q \to \pi)$ denotes fragmentation into any pion carrying nearly all of the
parent quark's energy. Fragmentation into each type of kaon also occurs with equal
probability; hence

$$
\tfrac{1}{4}\, p(q \to K) = p(q \to K^+) = p(q \to K^-) = p(q \to K^0) = p(q \to \bar{K}^0),
\tag{A.5}
$$

where $p(q \to K)$ denotes fragmentation into any kaon carrying nearly all of the parent
quark's energy.

Pythia contains a parameter that sets the number of string fragmentation kaons 
relative to the number of fragmentation pions. The default value of this parameter, 
which has been tuned to LEP I data, is 0.3; for every 1 up quark and every 1 down 
quark, 0.3 strange quarks are produced. Strange particles are produced perturbatively 
in the hard interaction itself, and in perturbative radiation, at a ratio larger than 
0.3:1:1. This leads to the inequality

$$
0.3 < \frac{p(q \to K)}{p(q \to \pi)} < 1,
\tag{A.6}
$$

where $p(q \to K)$ and $p(q \to \pi)$ are as defined above.

The probability for a jet to be misreconstructed as a tau lepton can be written

$$
p(j \to \tau^+) = p(j \to \tau_1^+) + p(j \to \tau_3^+),
\tag{A.7}
$$

where $p(j \to \tau_1^+)$ denotes the probability for a jet to fake a 1-prong tau, and
$p(j \to \tau_3^+)$ denotes the probability for a jet to fake a 3-prong tau. For 1-prong taus,

$$
p(j \to \tau_1^+) = p(q \to \pi^+)\, p(\pi^+ \to \tau_1^+) + p(q \to K^+)\, p(K^+ \to \tau_1^+).
\tag{A.8}
$$

Similar equations hold for negatively charged taus.

Figure A-4 shows the probability for a quark (or gluon) to fake a one-prong tau,
as a function of transverse momentum. Using fragmentation functions tuned on
LEP I data, Pythia predicts the probability for a quark jet to fake a one-prong
tau to be roughly four times the probability for a gluon jet to fake a one-prong tau.
This difference in fragmentation is incorporated into Vista's treatment of jets faking
electrons, muons, taus, and photons. The Vista correction model includes such
correction factors as the probability for a jet with a parent quark to fake an electron
(0033 and 0034) and the probability for a jet with a parent quark to fake a muon
(0035); the probability for a jet with a parent gluon to fake an electron or muon is
then obtained by dividing the values of these fitted correction factors by four.

[Plot for Figure A-4: misreconstruction probability (vertical axis, 0 to 0.004) versus
generated $p_T$ in GeV (horizontal axis, 20 to 140), for $p(q \to \tau)$ and $p(g \to \tau)$.]

Figure A-4: The probability for a generated parton to be misreconstructed as a
one-prong $\tau$, as a function of the parton's generated $p_T$. Red circles show the probability
for a jet arising from a parent quark to be misreconstructed as a one-prong tau. Blue
triangles show the probability for a jet arising from a parent gluon to be
misreconstructed as a one-prong tau.

This effect is investigated using fake one-prong taus reconstructed in Pythia dijet 
samples. 

Figure A-5 shows that the reconstructed fake tau has about 75 ± 18% of the $p_T$
of the prominent generated particle, defined to be the generated particle carrying the
greatest $p_T$ and being within a cone of $\Delta R < 0.4$ centered on the reconstructed tau.
The $p_T$ of the misreconstructed tau is on average more undermeasured if the generated
parton is a gluon than if it is a quark. This reduction in the $p_T$ of the fake tau is
implemented in Vista when a jet is made to fake a $\tau$ during the misreconstruction
process.

Figure A-5: Distribution of the $p_T$ of the fake $\tau$ over the $p_T$ of the prominent generated
particle (pgp), which is defined as the generated particle within $\Delta R < 0.4$ from the
reconstructed $\tau$ with the greatest $p_T$. The pgp is almost always a quark or a gluon,
and more likely to be a quark by a factor of four.

Figure A-6 shows the remaining generated $p_T$ to be carried by neutral particles:
mostly $\pi^0$'s, followed by $K^0$'s and $\eta$'s decaying to photons or to three neutral pions.
The $p_T$ of the fake tau is determined by the track and reconstructed $\pi^0$'s.

The physical mechanism underlying the process whereby an incident photon or
neutral pion is misreconstructed as an electron is a conversion in the material serving
as the support structure of the silicon vertex detector. This process produces exactly
as many $e^+$ as $e^-$, leading to

$$
\begin{aligned}
\tfrac{1}{2}\, p(\gamma \to e) &= p(\gamma \to e^+) = p(\gamma \to e^-), \\
\tfrac{1}{2}\, p(\pi^0 \to e) &= p(\pi^0 \to e^+) = p(\pi^0 \to e^-),
\end{aligned}
\tag{A.9}
$$

where e is an electron or positron.

From Fig. A-1, the average $p_T$ of electrons reconstructed from 25 GeV incident
photons is 23.9 ± 1.4 GeV. The average $p_T$ of electrons reconstructed from incident
25 GeV neutral pions is 23.7 ± 1.3 GeV.

The charge asymmetry between $p(K^+ \to e^+)$ and $p(K^- \to e^-)$ in Table A.1 arises
because $K^-$ can capture on a nucleon, producing a hyperon ($\Sigma$), which $K^+$ does not
produce, due to baryon number and strangeness conservation. Among the products
of the hyperon decay are neutral pions, which decay electromagnetically and deposit
in the electromagnetic calorimeter the energy needed to have a fake $e^-$. The absence
of this process in $K^+ + N$ interactions reduces $p(K^+ \to e^+)$ relative to $p(K^- \to e^-)$
by roughly a factor of two.

Figure A-6: Upper: The distribution of the $p_T$ of the prominent neutral generated
particle (pngp), which is the neutral generated particle with the greatest $p_T$ within
a cone of $\Delta R < 0.4$ from the fake one-prong $\tau$, divided by the $p_T$ of the prominent
generated particle (pgp), which happens to be either a quark or a gluon. Lower: $p_T$
of the pngp plus the $p_T$ of the reconstructed $\tau$, divided by the $p_T$ of the pgp. The
fact that this distribution peaks around 1 shows that the generated $p_T$ that is missing
from the fake $\tau$ was carried by the pngp. Most of the time the pngp is a $\pi^0$.

The physical process primarily responsible for $\pi^\pm \to e^\pm$ is inelastic charge
exchange

$$
\pi^- p \to \pi^0 n, \qquad \pi^+ n \to \pi^0 p
\tag{A.10}
$$

occurring within the electromagnetic calorimeter. The charged pion leaves the
"electron's" track in the CDF tracking chamber, and the $\pi^0$ produces the "electron's"
electromagnetic shower. No true electron appears at all in this process, except as
secondaries in the electromagnetic shower originating from the $\pi^0$.

The average $p_T$ of reconstructed "electrons" originating from a single charged
pion is 18.8 ± 2.2 GeV, indicating that the misreconstructed "electron" in this case
is measured to have on average only 75% of the total energy of the parent quark or
gluon. This is expected, since the recoiling nucleon from the charge exchange process
carries some of the incident pion's momentum.

An additional small loss in energy for a jet misreconstructed as an electron, photon,
or muon is expected since the leading $\pi^+$, $K^+$, $\pi^0$, or $\gamma$ takes only some fraction of
the parent quark's energy.

The cross sections for $\pi^- p \to \pi^0 n$ and $\pi^+ n \to \pi^0 p$, proceeding through the isospin
$I$ conserving and $I_3$ independent strong interaction, are roughly equal. The
corresponding particles in the two reactions are related by interchanging the signs of their
z-components of isospin.

The probability for a 25 GeV $\pi^+$ to decay to a $\mu^+$ can be written

$$
p(\pi^+ \to \mu^+) = p(\text{decays within tracker}) + p(\text{decays within calorimeter}).
\tag{A.11}
$$

The probability for the pion to decay within the tracking volume is

$$
p(\text{decays within tracker}) = 1 - e^{-R_{\text{tracker}}/\gamma (c\tau)},
\tag{A.12}
$$

where $\gamma = 25\ \text{GeV} / 140\ \text{MeV} = 180$ is the pion's Lorentz boost, the proper decay
length of the charged pion is $(c\tau) = 7.8$ meters, and the radius of the CDF tracking
volume is $R_{\text{tracker}} = 1.5$ meters, giving p(decays within tracker) = 0.001. The
probability for the pion to decay within the calorimeter volume is

$$
p(\text{decays within calorimeter}) \approx \lambda_I/\gamma (c\tau),
\tag{A.13}
$$

where $\lambda_I \approx 0.4$ meters is the nuclear interaction length for charged pions on lead
or iron and the path length through the calorimeter is $L_{\text{cal}} \approx 2$ meters, leading
to p(decays within calorimeter) $\approx$ 0.00025. Summing the contributions from decay
within the tracking volume and decay within the calorimeter volume, $p(\pi^+ \to \mu^+) \approx
0.00125$.
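
The decay-in-flight estimate of Eqs. A.11 to A.13 can be re-evaluated numerically with the inputs quoted above. The sketch below is only a cross-check; the variable names are illustrative, and because it uses the unrounded boost it gives values close to, but not identical with, the rounded figures in the text.

```python
import math

# Inputs quoted in the text; variable names are illustrative only.
PION_PT_GEV = 25.0
PION_MASS_GEV = 0.140
C_TAU_M = 7.8        # proper decay length of the charged pion, in meters
R_TRACKER_M = 1.5    # radius of the tracking volume, in meters
LAMBDA_I_M = 0.4     # nuclear interaction length in the calorimeter, in meters

gamma = PION_PT_GEV / PION_MASS_GEV     # Lorentz boost, about 180
decay_length_m = gamma * C_TAU_M        # mean decay length in the lab frame

p_tracker = 1.0 - math.exp(-R_TRACKER_M / decay_length_m)   # Eq. A.12
p_calorimeter = LAMBDA_I_M / decay_length_m                  # Eq. A.13
p_total = p_tracker + p_calorimeter                          # Eq. A.11

print(f"gamma                        = {gamma:.1f}")
print(f"p(decays within tracker)     = {p_tracker:.5f}")
print(f"p(decays within calorimeter) = {p_calorimeter:.5f}")
print(f"p(pi+ -> mu+)                = {p_total:.5f}")
```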

The primary physical mechanism by which a jet fakes a photon is for the parent
quark or gluon to fragment into a leading $\pi^0$ carrying nearly all the momentum.
The highly boosted $\pi^0$ decays within the beam pipe to two photons that are
sufficiently collinear to appear in the preshower, electromagnetic calorimeter, and shower
maximum detector as a single photon. Thus

$$
p(j \to \gamma) = p(q \to \pi^0)\, p(\pi^0 \to \gamma).
\tag{A.14}
$$

An immediate corollary is that the misreconstructed "photon" carries the energy of
the parent quark or gluon, and is well measured.

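Since $p(q \to \pi^0) \gg p(q \to \gamma)$, it follows from Eq. A.4 and Table A.1 that the
conversion contribution to $p(j \to e)$ is $\approx 75\%$, and the charge exchange contribution
is $\approx 25\%$:

$$
\frac{p(q \to \gamma)\, p(\gamma \to e^+) + p(q \to \pi^0)\, p(\pi^0 \to e^+)}
     {p(q \to \pi^+)\, p(\pi^+ \to e^+) + p(q \to K^+)\, p(K^+ \to e^+)}
\approx \frac{0.75}{0.25}.
\tag{A.15}
$$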

The number of $e^+ j$ events in data is 0.9 times the number of $e^- j$ events. This
charge asymmetry arises from $p(K^+ \to e^+)$ and $p(K^- \to e^-)$ in Table A.1.
Quantitatively,

$$
\frac{p(j \to e^+)}{p(j \to e^-)} \approx
\frac{0.9 + 0.2\, p(K^+ \to e^+)/p(K \to e)}{0.9 + 0.2\, p(K^- \to e^-)/p(K \to e)},
\tag{A.16}
$$

where 0.9 is the sum of 0.75 from Eq. A.15 and $0.15 \approx 0.25 \times 0.6$ from Eq. A.6,
and 0.2 is twice $1 - 0.9$. From $p(K^+ \to e^+)$ and $p(K^- \to e^-)$ in Table A.1,
$p(K^+ \to e^+)/p(K \to e) = 1/3$ and $p(K^- \to e^-)/p(K \to e) = 2/3$, predicting
$p(j \to e^+)/p(j \to e^-) = 0.935$, in reasonable agreement with the ratio of the observed
number of events in the $e^+ j$ and $e^- j$ final states.
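
The arithmetic behind the predicted ratio can be checked directly. The sketch below simply evaluates Eq. A.16 with the kaon fractions 1/3 and 2/3 quoted in the text; the function and argument names are illustrative.

```python
def positron_to_electron_fake_ratio(k_plus_fraction, k_minus_fraction):
    """Evaluate Eq. A.16 for p(K+ -> e+)/p(K -> e) and p(K- -> e-)/p(K -> e)."""
    charge_symmetric = 0.9   # 0.75 + 0.15, as quoted in the text
    kaon_weight = 0.2        # twice (1 - 0.9), as quoted in the text
    numerator = charge_symmetric + kaon_weight * k_plus_fraction
    denominator = charge_symmetric + kaon_weight * k_minus_fraction
    return numerator / denominator

ratio = positron_to_electron_fake_ratio(1 / 3, 2 / 3)
print(f"p(j -> e+) / p(j -> e-) = {ratio:.3f}")  # prints 0.935
```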
The number of $j\mu^+$ events observed in CDF Run II is 1.1 times the number of $j\mu^-$
events observed. This charge asymmetry arises from $p(K^+ \to \mu^+)$ and $p(K^- \to \mu^-)$
in Table A.1.

The physical mechanism by which a prompt photon fakes a tau lepton is for the
photon to convert, producing an electron or positron carrying most of the photon's
energy, which is then misreconstructed as a tau. The probability for this to occur is
equal for positively and negatively charged taus,

$$
\tfrac{1}{2}\, p(\gamma \to \tau) = p(\gamma \to \tau^+) = p(\gamma \to \tau^-),
\tag{A.17}
$$

and is related to previously defined quantities by

$$
p(\gamma \to \tau) = p(\gamma \to e)\, \frac{1}{p(e \to e)}\, p(e \to \tau),
\tag{A.18}
$$

where $p(\gamma \to e)$ denotes the fraction of produced photons that are reconstructed as
electrons, $p(e \to e)$ denotes the fraction of produced electrons that are reconstructed
as electrons, and hence $p(\gamma \to e)/p(e \to e)$ is the fraction of produced photons that
pair produce a single leading electron.

Note $p(e \to \gamma) \approx p(\gamma \to e)$ from Table A.1, as expected, with value of $\approx 0.03$
determined by the amount of material in the inner detectors and the tightness of
isolation criteria. A hard bremsstrahlung followed by a conversion is responsible for
electrons being reconstructed with opposite sign; hence

$$
p(e^\pm \to e^\mp) = p(e^+ \to e^-) = p(e^- \to e^+)
\approx \tfrac{1}{2}\, p(e^\pm \to \gamma)\, p(\gamma \to e^\mp),
\tag{A.19}
$$

where the factor of 1/2 comes because the material already traversed by the $e^\pm$ will
not be traversed again by the $\gamma$. In particular, track curvature mismeasurement is
not responsible for erroneous sign determination in the central region of the CDF
detector.

From knowledge of the underlying physical mechanisms by which jets fake electrons,
muons, taus, and photons, the simple use of a reconstructed jet as a lepton
or photon with an appropriate fake rate applied to the weight of the event needs
slight modification to correctly handle the fact that a jet that has faked a lepton
or photon generally is measured more accurately than a hadronic jet. Rather than
using the momentum of the reconstructed jet, the momentum of the parent quark
or gluon is determined by adding up all Monte Carlo particle level objects within a
cone of $\Delta R = 0.4$ about the reconstructed jet. In misreconstructing a jet in an event,
the momentum of the corresponding parent quark or gluon is used rather than the
momentum of the reconstructed jet. A jet that fakes a photon then has momentum
equal to the momentum of the parent quark or gluon plus a fractional correction equal
to $0.01 \times (\text{parent } p_T - 25\ \text{GeV})/(25\ \text{GeV})$ to account for leakage out of the cone of
$\Delta R = 0.4$, and a further smearing of $0.2\,\sqrt{\text{GeV}} \times \sqrt{\text{parent } p_T}$, reflecting the
electromagnetic resolution of the CDF detector. The momenta of jets that fake photons are
multiplied by an overall factor of 1.12, and jets that fake electrons, muons, or taus
are multiplied by an overall factor of 0.95. These numbers are determined by the $\ell\not{p}$,
$\ell j$, and $\gamma j$ final states. The distributions most sensitive to these numbers are the
missing energy and the jet $p_T$.
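
A minimal sketch of how the momentum of a jet faking a photon could be assigned from these numbers is given below. The text does not spell out the order in which the leakage correction, the smearing, and the overall factor of 1.12 are applied, so the ordering here, the function name, and the `rng` argument are all assumptions of this sketch rather than the Vista implementation.

```python
import math
import random

def fake_photon_pt(parent_pt_gev, rng=random):
    """Assign a pT (GeV) to a jet faking a photon; the order of steps is assumed."""
    # Fractional correction for leakage out of the Delta R = 0.4 cone.
    leakage = 0.01 * (parent_pt_gev - 25.0) / 25.0
    corrected = parent_pt_gev * (1.0 + leakage)
    # Electromagnetic resolution: 0.2 sqrt(GeV) x sqrt(parent pT).
    resolution = 0.2 * math.sqrt(parent_pt_gev)
    smeared = rng.gauss(corrected, resolution)
    # Overall factor applied to jets that fake photons.
    return 1.12 * smeared

random.seed(0)  # fixed seed so that repeated runs give the same draw
print(f"fake photon pT for a 50 GeV parent: {fake_photon_pt(50.0):.1f} GeV")
```

Passing a stand-in `rng` whose `gauss` method returns the mean makes the deterministic part easy to inspect: a 25 GeV parent then maps to 1.12 x 25 = 28 GeV.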

A b quark fragmenting into a leading b hadron that then decays leptonically or
semileptonically results in an electron or muon that shares the $p_T$ of the parent b
quark with the associated neutrino. If all hadronic decay products are soft, the
distribution of the momentum fraction carried by the charged lepton can be obtained
by considering the decay of a scalar to two massless fermions. Isolated and energetic
electrons and muons arising from parent b quarks in this way are modeled as having $p_T$
equal to the parent b quark $p_T$, multiplied by a random number uniformly distributed
between 0 and 1.

A.2 Additional background sources

This appendix provides additional details on the estimation of the Standard Model 
prediction. 
A.2.1 Cosmic ray and beam halo muons

There are four dominant categories of events caused by cosmic ray muons penetrating
the detector: $\mu^+\mu^-$, $\mu\not{p}$, $\gamma\not{p}$, and $j\not{p}$. There is negligible contribution from cosmic ray
secondaries of any particle type other than muons.

A cosmic ray muon penetrating the CDF detector whose trajectory passes within
1 mm of the beam line and within −60 < z < 60 cm of the origin may be reconstructed
as two outgoing muons. In this case the cosmic ray event is partitioned into the final
state $\mu^+\mu^-$. If one of the tracks is missed, the cosmic ray event is partitioned into
the final state $\mu\not{p}$. The standard CDF cosmic ray filter, which makes use of drift time
information in the central tracking chamber, is used to reduce these two categories of
cosmic ray events.

CDF data events with exactly one track (corresponding to one muon) and events
with exactly two tracks (corresponding to two muons) are used to estimate the cosmic
ray muon contribution to the final states $\mu\not{p}$ and $\mu^+\mu^-$ after the cosmic ray filter.
This sample of events is used as the SM background process cosmic $\mu$. The cosmic $\mu$
sample does not contribute to the events passing the analysis offline trigger, whose
cleanup cuts require the presence of three or more tracks.

The remaining two categories are $\gamma\not{p}$ and $j\not{p}$, resulting from a cosmic ray muon
that penetrates the CDF electromagnetic or hadronic calorimeter and undergoes a
hard bremsstrahlung in one calorimeter cell. Such an interaction can mimic a single
photon or a single jet, respectively. The reconstruction algorithm infers the presence
of significant missing energy balancing the "photon" or "jet." If this cosmic ray
interaction occurs during a bunch crossing in which there is a $p\bar{p}$ interaction producing
three or more tracks, the event will be partitioned into the final state $\gamma\not{p}$ or $j\not{p}$.

CDF data events with fewer than three tracks are used to estimate the cosmic
ray muon contribution to the final states $\gamma\not{p}$ and $j\not{p}$. These samples of events are
used as SM background processes cosmic $\gamma$ and cosmic j for the modeling of this
background, corresponding to offline triggers requiring a photon with $p_T > 60$ GeV,
or a jet with $p_T > 40$ GeV (prescaled) or $p_T > 200$ GeV (unprescaled), respectively.
Figure A-7: The distribution of transverse momentum and azimuthal angle for
photons and jets in the $\gamma\not{p}$ and $j\not{p}$ final states, dominated by cosmic ray and beam halo
muons. The vertical axis shows the number of events in each bin. Data are shown as
filled (black) circles; the SM prediction is shown as the shaded (red) histogram. The
prediction includes contributions from cosmic ray and beam halo muons, estimated
using events containing fewer than three reconstructed tracks. The contribution from
cosmic ray muons is flat in $\phi$, while the contribution from beam halo is localized
to $\phi = 0$. The only degrees of freedom for the background to these final states are
the cosmic $\gamma$ and cosmic j correction factors, whose values are determined from the
global fit (Table 4.2).
These samples do not contribute to the events passing the analysis offline trigger,
whose cleanup cuts require three or more tracks. The contribution of these events is
adjusted with correction factors that are listed as cosmic $\gamma$ and cosmic j "k-factors"
in Table 4.2, but which are more properly understood as reflecting the number of
bunch crossings with zero $p\bar{p}$ interactions (resulting in zero reconstructed tracks)
relative to the number of bunch crossings with one or more interactions (resulting in
three or more reconstructed tracks).

The cosmic ray muon contribution to the final states $\gamma\not{p}$ and $j\not{p}$ is uniform as a
function of the CDF azimuthal angle $\phi$. Consider the CDF detector to be a thick
cylindrical shell, and consider two arbitrary infinitesimal volume elements at different
locations in the material of the shell. Since the two volume elements have similar
overburdens, the number of cosmic ray muons with E > 20 GeV penetrating the
first volume element is very nearly the same as the number of cosmic ray muons with
E > 20 GeV penetrating the second volume element. Since the material of the CDF
calorimeters is uniform as a function of CDF azimuthal angle $\phi$, it follows that the
cosmic ray muon contribution to the final states $\gamma\not{p}$ and $j\not{p}$ should also be uniform as
a function of $\phi$. In particular, it is noted that the $\phi$ dependence of this contribution
depends solely on the material distribution of the CDF calorimeter, which is uniform in
$\phi$, and has no dependence on the distribution of the horizon angle of the muons from
cosmic rays.

The final states $\gamma\not{p}$ and $j\not{p}$ are also populated by beam halo muons, traveling
horizontally through the CDF detector in time with a bunch. A beam halo muon
can undergo a hard bremsstrahlung in the electromagnetic or hadronic calorimeters,
producing an energy deposition that can be reconstructed as a photon or jet,
respectively. These beam halo muons tend to lie in the horizontal plane and outside of
the Tevatron ring, as if centrifugally hurled away from the beam; they horizontally
penetrate the CDF detector along z at y = 0 and x > 0, hence at $\phi = 0$.

Fig. A-7 shows the $\gamma\not{p}$ and $j\not{p}$ final states, in which events come primarily from
cosmic ray and beam halo muons.
A.2.2 Multiple interactions

In order to estimate event overlaps, consider an interesting event observed in final 
state C, which looks like an overlap of two events in the final states A and B. An 
example is C=e+e-4j, A=e+e- and B=4j. It is desired to estimate how many C 
events are expected from the overlap of A and B events, given the observed frequencies 
of A and B. 

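Let $\mathcal{L}(t)$ be the instantaneous luminosity as a function of time t; let

$$
L = \int_{\text{Run II}} \mathcal{L}(t)\, dt = 1993\ \text{pb}^{-1}
\tag{A.20}
$$

denote the total integrated luminosity; and let

$$
\bar{\mathcal{L}} = \frac{\int_{\text{Run II}} \mathcal{L}(t)\, \mathcal{L}(t)\, dt}{\int_{\text{Run II}} \mathcal{L}(t)\, dt}
\tag{A.21}
$$

be the luminosity-averaged instantaneous luminosity. Denote by $t_0$ the time interval
of 396 ns between successive bunch crossings. The total number of effective bunch
crossings X is then

$$
X = \frac{L}{\bar{\mathcal{L}}\, t_0} \approx 5 \times 10^{13}.
\tag{A.22}
$$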

Letting A and B denote the number of observed events in final states A and B, it
follows that the number of events in the final state C expected from overlap of A and
B is

$$
C = X\, \frac{A}{X}\, \frac{B}{X} = \frac{AB}{X}.
\tag{A.23}
$$

Overlap events are included in the SM background estimate, although their
contribution is generally negligible.
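
The overlap estimate amounts to treating each bunch crossing as an independent trial: an A event occurs in a given crossing with probability A/X, a B event with probability B/X, and both occur together in X(A/X)(B/X) crossings. A short sketch follows; the event counts and the number of crossings used here are hypothetical values chosen only to illustrate the scale, not numbers from this analysis.

```python
def expected_overlap_events(n_a, n_b, n_crossings):
    """Expected number of C events from chance overlap of A and B events."""
    if n_crossings <= 0:
        raise ValueError("number of bunch crossings must be positive")
    return n_a * n_b / n_crossings

# Hypothetical inputs, for illustration only.
n_ee = 1.0e5          # assumed number of observed e+e- events
n_4j = 1.0e6          # assumed number of observed 4j events
n_crossings = 5.0e13  # assumed number of effective bunch crossings X

overlap = expected_overlap_events(n_ee, n_4j, n_crossings)
print(f"expected e+e- 4j overlap events: {overlap:.4f}")
```

With inputs of this size the expected overlap is far below one event, which illustrates why the contribution is generally negligible.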



A.2.3 Intrinsic $k_T$

Significant discrepancy is observed in many final states containing two objects o1 and
o2 in the variables $\Delta\phi$(o1, o2), uncl $p_T$, and $\not{p}_T$. These discrepancies are ascribed to
the sum of two effects: (1) an intrinsic Fermi motion of the colliding partons within
the proton and anti-proton, and (2) soft radiation along the beam axis. The sum
of these two effects appears to be larger in Nature than predicted by Pythia with
the parameter tunes used for the generation of the samples employed in this analysis.
This discrepancy is well known from previous studies at the Tevatron and elsewhere,
and affects this analysis similarly to other Tevatron analyses.

The W and Z electroweak samples used in this analysis have been generated with
an adjusted Pythia parameter that increases the intrinsic $k_T$. For all other generated
Standard Model events, the net effect of the Fermi motion of the colliding partons
and the soft non-perturbative radiation is hypothesized to be described by an overall
"effective intrinsic $k_T$," and the center of mass of each event is given a transverse kick.
Specifically, for every event of invariant mass m and generated summed transverse
momentum $\sum p_T$, a random number $k_T$ is pulled from the probability distribution

pikr) cc {kr < m/ 5) X [^gikr, fi = 0, ai) + 

lg{kT;f^ = 0,a2)], (A.24) 

where {kx < m/5) evaluates to unity if true and zero if false; g{kT', yU, cr) is a Gaussian 
function of k^ with center at fi and width a; cti = 2.55 GeV+0.0085 is the width 
of the core of the double Gaussian; and o"2 = 5.25 GeV + 0.0175 J2pt is the width of 
the second, wider Gaussian. The event is then boosted to an inertial frame traveling 



with speed 
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kT/m with respect to the lab frame, in a direction transverse to the 
beam axis, where m is the invariant mass of all reconstructed objects in the event, 
along an azimuthal angle pulled randomly from a uniform distribution between and 
27r. The momenta of identified objects are recalculated in the lab frame. Sixty percent 
of the recoil kick is assigned to unclustered momentum in the event. The remaining 
forty percent of the recoil kick is assumed to disappear down the beam pipe, and 
contributes to the missing transverse momentum in the event. This picture, and the 
particular parameter values that accompany this story, are determined primarily by 
the unci px and distributions in highly populated two-object final states, including 
the low-pT 2j final state, the high-p^ 2j final state, and the final states j'j, e'^e~ , and 
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Under the hypothesis described, reasonable although imperfect agreement with 
observation is obtained. The result of this analysis supports the conclusions of pre- 
vious studies indicating that the effective intrinsic needed to match observation 
is quite large relative to naive expectation. That the data appear to require such a 
large effective intrinsic niay be pointing out the need for some basic improvement 
to our understanding of this physics. 
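
The transverse kick described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: $k_T$ is treated as a magnitude (the absolute value of a zero-centered Gaussian draw), the mixture weights of 4/5 for the core and 1/5 for the wide component are assumed, and the rejection loop and function names are choices made for this sketch rather than the procedure used in the analysis code.

```python
import math
import random


def sample_effective_kt(m, sum_pt, rng, core_weight=0.8):
    """Draw k_T (GeV) from a double Gaussian truncated at k_T < m/5.

    ASSUMPTIONS: k_T is a magnitude, and core_weight = 4/5 is assumed.
    """
    sigma1 = 2.55 + 0.0085 * sum_pt  # width of the core Gaussian (GeV)
    sigma2 = 5.25 + 0.0175 * sum_pt  # width of the wider Gaussian (GeV)
    while True:
        # Pick a mixture component, draw, and reject draws failing k_T < m/5.
        sigma = sigma1 if rng.random() < core_weight else sigma2
        kt = abs(rng.gauss(0.0, sigma))
        if kt < m / 5.0:
            return kt


def transverse_kick(m, sum_pt, rng):
    """Return (beta, phi): boost speed k_T/m and an azimuth in [0, 2*pi]."""
    kt = sample_effective_kt(m, sum_pt, rng)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return kt / m, phi


rng = random.Random(12345)
kicks = [transverse_kick(m=100.0, sum_pt=80.0, rng=rng) for _ in range(1000)]
```

Because the mixture component is redrawn on every attempt, the accepted values follow the full mixture restricted to $k_T < m/5$, and the boost speed is always below 0.2.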

A.3 Global fit

This section describes the construction of the global $\chi^2$ used in the Vista global fit.

A.3.1 The $\chi^2_k$

The bins in the CDF high-$p_T$ data sample are labeled by the index $k = (k_1, k_2)$,
where each value of $k_1$ represents a phrase such as "this bin contains events with three
objects: one with $17 < p_T < 25$ GeV and $|\eta| < 0.6$, one with $40 < p_T < 60$ GeV and
$0.6 < |\eta| < 1.0$, and one with $25 < p_T < 40$ GeV and $1.0 < |\eta|$," and each value of $k_2$
represents a phrase such as "this bin contains events with three objects: an electron,
muon, and jet, respectively." The reason for splitting $k$ into $k_1$ and $k_2$ is that a jet
can fake an electron (mixing the contents of $k_2$), but an object with $|\eta| < 0.6$ cannot
fake an object with $0.6 < |\eta| < 1.0$ (no mixing of $k_1$). The term corresponding to the
$k^{\text{th}}$ bin takes the form of Eq. 3.1, where Data[$k$] is the number of data events observed
in the $k^{\text{th}}$ bin, SM[$k$] is the number of events predicted by the Standard Model in
the $k^{\text{th}}$ bin, $\delta$SM[$k$] is the Monte Carlo statistical uncertainty on the Standard Model
prediction in the $k^{\text{th}}$ bin, and $\sqrt{\text{SM}[k]}$ is the statistical uncertainty on the prediction
in the $k^{\text{th}}$ bin. To legitimize the use of Gaussian errors, only bins containing eight or
more data events are considered. The Standard Model prediction SM[$k$] for the $k^{\text{th}}$
bin can be written in terms of the introduced correction factors as

$$
\text{SM}[(k_1,k_2)] = \sum_{k_2' \in \text{objectLists}} \; \sum_{l \in \text{processes}}
\left(\int \mathcal{L}\,dt\right) \cdot (\text{kFactor}[l]) \cdot (\text{SM}_0[(k_1,k_2')][l]) \cdot
(\text{probabilityToBeSoMisreconstructed}[(k_1,k_2')][k_2]) \cdot
(\text{probabilityPassesTrigger}[(k_1,k_2)]),
\qquad \text{(A.25)}
$$



where SM[$k$] is the Standard Model prediction for the $k^{\text{th}}$ bin; the index $k$ is the
Cartesian product of the two indices $k_1$ and $k_2$ introduced above, labeling the regions
of the detector in which there are energy clusters and the identified objects
corresponding to those clusters, respectively; the index $k_2'$ is a dummy summation
index; the index $l$ labels Standard Model background processes, such as dijet
production or W+1 jet production; SM$_0$[($k_1$, $k_2'$)][$l$] is the initial number of Standard
Model events predicted in bin ($k_1$, $k_2'$) from the process labeled by the index
$l$; probabilityToBeSoMisreconstructed[($k_1$, $k_2'$)][$k_2$] is the probability that an event
produced with energy clusters in the detector regions labeled by $k_1$ that are identified
as objects labeled by $k_2'$ would be mistaken as having objects labeled by $k_2$; and
probabilityPassesTrigger[($k_1$, $k_2$)] represents the probability that an event produced
with energy clusters in the detector regions labeled by $k_1$ that are identified as objects
labeled by $k_2$ would pass the trigger.
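
The double sum of Eq. A.25 can be illustrated with a small NumPy sketch for one fixed detector-region label $k_1$. All array contents below are toy numbers chosen for illustration, and the variable names are hypothetical; they are not taken from the Vista implementation.

```python
import numpy as np

# Toy dimensions (hypothetical): 2 object lists (k2) and 3 background
# processes (l), evaluated for one fixed detector-region label k1.
rng = np.random.default_rng(0)
n_k2, n_proc = 2, 3

int_lumi = 1000.0                              # integrated luminosity (toy, pb^-1)
k_factor = np.array([1.1, 1.3, 1.0])           # kFactor[l]
sm0 = rng.uniform(0.0, 0.05, (n_k2, n_proc))   # SM0[(k1, k2')][l] (toy, pb)
p_misreco = np.array([[0.98, 0.02],            # rows: produced list k2'
                      [0.01, 0.99]])           # columns: reconstructed list k2
p_trigger = np.array([0.95, 0.90])             # probabilityPassesTrigger[(k1, k2)]


def sm_prediction(int_lumi, k_factor, sm0, p_misreco, p_trigger):
    """SM[(k1,k2)] = sum over k2' and l of the product of factors in Eq. A.25."""
    per_list = int_lumi * (sm0 * k_factor).sum(axis=1)  # sum over l, indexed by k2'
    return (per_list @ p_misreco) * p_trigger           # sum over k2', indexed by k2


sm = sm_prediction(int_lumi, k_factor, sm0, p_misreco, p_trigger)
```

Writing the misreconstruction probabilities as a matrix makes the mixing among object lists explicit, while the detector-region label $k_1$ is never mixed.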



The quantity SM$_0$[($k_1$, $k_2'$)][$l$] is obtained by generating some number $n_l$ (say $10^4$)
of Monte Carlo events corresponding to the process $l$. The event generator provides
a cross section $\sigma_l$ for this process $l$. The weight of each of these Monte Carlo events is
equal to $\sigma_l/n_l$. Passing these events through the CDF simulation and reconstruction,
the sum of the weights of these events falling into the bin ($k_1$, $k_2'$) is SM$_0$[($k_1$, $k_2'$)][$l$].

A.3.2 $\chi^2_{\text{constraints}}$

The term $\chi^2_{\text{constraints}}(\vec{s})$ in Eq. 3.2 reflects constraints on the values of the correction
factors determined by data other than those in the global high-$p_T$ sample. These
constraints include $k$-factors taken from theoretical calculations and numbers from
the CDF literature when use is made of CDF data external to the Vista high-$p_T$
sample. The constraints imposed are:

• The luminosity (0001) is constrained to be within 6% of the value measured by
the CDF Cerenkov luminosity counters.

• The fake rate $p(q \to \gamma)$ (0039) is constrained to be $2.6 \times 10^{-4} \pm 1.5 \times 10^{-5}$, from
the single particle gun study of Appendix A.1.

• The fake rate $p(e \to \gamma)$ (0032) plus the efficiency $p(e \to e)$ (0026) for electrons
in the plug is constrained to be within 1% of unity.

• Noting $p(q \to \gamma)$ corresponds to correction factor 0039, $p(q \to \pi^\pm) = 2\,p(q \to \pi^0)$,
and $p(q \to \pi^0) = p(q \to \gamma)/p(\pi^0 \to \gamma)$, and taking $p(\pi^0 \to \gamma) = 0.6$ and
$p(\pi^\pm \to \tau) = 0.415$ from the single particle gun study of Appendix A.1, the fake
rate $p(q \to \tau)$ (0038) is constrained to $p(q \to \tau) = p(q \to \pi^\pm)\,p(\pi^\pm \to \tau) \pm 10\%$.

• The $k$-factors for dijet production (0018 and 0019) are constrained to $1.10 \pm 0.05$
and $1.33 \pm 0.05$ in the kinematic regions $\hat{p}_T < 150$ GeV and $\hat{p}_T > 150$ GeV,
respectively, where $\hat{p}_T$ is the transverse momentum of the scattered partons in
the $2 \to 2$ process in the colliding parton center of momentum frame.

• The inclusive $k$-factor for $\gamma + N$jets (0004-0007) is constrained to $1.25 \pm 0.15$ [85,
86].

• The inclusive $k$-factor for $\gamma\gamma + N$jets (0008-0010) is constrained to $2.0 \pm 0.15$ [87].

• The inclusive $k$-factors for W and Z production (0011-0014 and 0015-0017) are
subject to a 2-dimensional Gaussian constraint, with mean at the NNLO/LO
theoretical values [88], and a covariance matrix that encapsulates the highly
correlated theoretical uncertainties, as discussed in Appendix A.4.

• Trigger efficiency correction factors are constrained to be less than unity.

• All correction factors are constrained to be positive.
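
As a worked example of how one of these constraints is assembled, the short Python sketch below chains the quark fake rates into the central value of the tau fake-rate constraint. It assumes a central value of 2.6e-4 for the quark-to-photon fake rate, and the variable names are purely illustrative.

```python
# Worked arithmetic for the p(q -> tau) constraint.
# ASSUMPTION: p(q -> gamma) is taken at a central value of 2.6e-4.
p_q_gamma = 2.6e-4    # p(q -> gamma), correction factor 0039
p_pi0_gamma = 0.6     # p(pi0 -> gamma), single particle gun study
p_pipm_tau = 0.415    # p(pi+- -> tau), single particle gun study

p_q_pi0 = p_q_gamma / p_pi0_gamma        # p(q -> pi0)
p_q_pipm = 2.0 * p_q_pi0                 # p(q -> pi+-) = 2 p(q -> pi0)
p_q_tau = p_q_pipm * p_pipm_tau          # central value of the constraint
low, high = 0.9 * p_q_tau, 1.1 * p_q_tau  # +-10% band

print(f"p(q->tau) = {p_q_tau:.3e} (allowed band {low:.3e} to {high:.3e})")
```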



A.3.3 Covariance matrix

This section describes the correction factor covariance matrix $\Sigma$. The inverse of the
covariance matrix is obtained from

$$\Sigma^{-1}_{ij} = \frac{1}{2} \left.\frac{\partial^2 \chi^2(\vec{s})}{\partial s_i\,\partial s_j}\right|_{\vec{s}_0}, \qquad \text{(A.26)}$$

where $\chi^2(\vec{s})$ is defined by Eq. 3.2 as a function of the correction factor vector $\vec{s}$,
vector elements $s_i$ and $s_j$ are the $i^{\text{th}}$ and $j^{\text{th}}$ correction factors, and $\vec{s}_0$ is the vector of
correction factors that minimizes $\chi^2(\vec{s})$. Numerical estimation of the right hand side
of Eq. A.26 is achieved by calculating $\chi^2$ at $\vec{s}_0$ and at positions slightly displaced from
$\vec{s}_0$ in the direction of the $i^{\text{th}}$ and $j^{\text{th}}$ correction factors, denoted by the unit vectors $\hat{i}$
and $\hat{j}$. Approximating the second partial derivative

$$\left.\frac{\partial^2 \chi^2}{\partial s_j\,\partial s_i}\right|_{\vec{s}_0} =
\frac{\chi^2(\vec{s}_0 + \hat{i}\,\delta s_i + \hat{j}\,\delta s_j) - \chi^2(\vec{s}_0 + \hat{j}\,\delta s_j)}{\delta s_j\,\delta s_i}
- \frac{\chi^2(\vec{s}_0 + \hat{i}\,\delta s_i) - \chi^2(\vec{s}_0)}{\delta s_j\,\delta s_i}$$

leads to

$$\Sigma^{-1}_{ij} = \frac{\chi^2(\vec{s}_0 + \delta s_i\,\hat{i} + \delta s_j\,\hat{j}) - \chi^2(\vec{s}_0 + \delta s_i\,\hat{i}) - \chi^2(\vec{s}_0 + \delta s_j\,\hat{j}) + \chi^2(\vec{s}_0)}{2\,\delta s_i\,\delta s_j} \qquad \text{(A.27)}$$

for appropriately small steps $\delta s_i$ and $\delta s_j$ away from the minimum. The covariance
matrix $\Sigma$ is calculated by inverting $\Sigma^{-1}$. The diagonal element $\Sigma_{ii}$ is the variance $\sigma_i^2$
of the $i^{\text{th}}$ correction factor, and the correlation $\rho_{ij}$ between the $i^{\text{th}}$ and $j^{\text{th}}$ correction
factors is $\rho_{ij} = \Sigma_{ij}/\sigma_i\sigma_j$. The variances of each correction factor, corresponding to the
diagonal elements of the covariance matrix, are shown in Table 4.2. The correlation
matrix obtained is shown in Table A.2.

Table A.2: Correction factor correlation matrix. The top row and left column show correction factor codes. Each element of
the matrix shows the correlation between the correction factors corresponding to the column and row. Each matrix element is
dimensionless; the elements along the diagonal are unity; the matrix is symmetric; positive elements indicate positive correlation,
and negative elements anti-correlation.
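
The finite-difference estimate of the inverse covariance matrix, its inversion, and the resulting correlation matrix can be sketched as follows. The quadratic toy $\chi^2$, the step sizes, and the function names are assumptions made for this illustration; the actual Vista $\chi^2$ is the one defined by Eq. 3.2.

```python
import numpy as np


def covariance_from_chi2(chi2, s0, steps):
    """Estimate the covariance and correlation matrices around the minimum s0.

    Uses forward finite differences of chi2, following the structure of Eq. A.27.
    """
    n = len(s0)
    inv_cov = np.empty((n, n))
    f0 = chi2(s0)
    for i in range(n):
        di = np.zeros(n)
        di[i] = steps[i]
        for j in range(n):
            dj = np.zeros(n)
            dj[j] = steps[j]
            inv_cov[i, j] = (chi2(s0 + di + dj) - chi2(s0 + di)
                             - chi2(s0 + dj) + f0) / (2.0 * steps[i] * steps[j])
    cov = np.linalg.inv(inv_cov)
    sigma = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sigma, sigma)   # rho_ij = Sigma_ij / (sigma_i sigma_j)
    return cov, corr


# Toy chi2 (hypothetical): a quadratic form whose inverse covariance is A.
A = np.array([[4.0, 1.0], [1.0, 2.0]])
s_min = np.array([1.0, 0.9])


def toy_chi2(s):
    d = s - s_min
    return float(d @ A @ d)


cov, corr = covariance_from_chi2(toy_chi2, s_min, steps=np.array([1e-3, 1e-3]))
```

For a purely quadratic $\chi^2$ the finite-difference expression is exact up to rounding, so the recovered covariance should match the inverse of the matrix used to build the toy function.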

A.4 Correction factor values

This section provides notes on the values of the Vista correction factors obtained from 
a global fit of Standard Model prediction to data. The correction factors considered 
are numbers that can in principle be calculated a priori, but whose calculation is in 
practice not feasible. These correction factors divide naturally into two classes, the 
first of which reflects the difficulty of calculating the Standard Model prediction to 
all orders, and the second of which reflects the difficulty of understanding from first 
principles the response of the experimental apparatus. 

The theoretical correction factors considered are of two types. The difficulty 
of calculating the Standard Model prediction for many processes to all orders in 
perturbation theory is handled through the introduction of $k$-factors, representing the
ratio of the true all orders prediction to the prediction at lowest order in perturbation 
theory. Uncertainties in the distribution of partons inside the colliding proton and 
anti-proton as a function of parton momentum are in principle handled through the 
introduction of correction factors associated with parton distribution functions, but 
there are currently no discrepancies to motivate this. 

Experimental correction factors correspond to numbers describing the response of 
the CDF detector that are precisely calculable in principle, but that are in practice 
best constrained by the high-$p_T$ data themselves. These correction factors take the
form of the integrated luminosity, object identification efficiencies, object misidenti- 
fication probabilities, trigger efficiencies, and energy scales. 

A.4.1 $k$-factors

For nearly all Standard Model processes, $k$-factors are used as an overall multiplicative
constant, rather than being considered to be a function of one or more kinematic
variables. The spirit of the approach is to introduce as few correction factors as
possible, and to only introduce correction factors motivated by specific discrepancies.

Figure A-8: Variation of the $k$-factors for inclusive W and Z production under different
choices of parton distribution functions, from the Alekhin parton distribution
error set [89]. The correlation of the uncertainty on these two $k$-factors due to
uncertainty in the parton distribution functions is 0.955.



0001. The integrated luminosity of the analysis sample has a close relationship
with the theoretically determined values of inclusive W and Z production at the
Tevatron. Figure A-8 shows the variation in calculated inclusive W and Z $k$-factors
under changes in the assumed parton distribution functions. Each point represents
a different W and Z inclusive cross section determined using modified parton
distribution functions. The use of 16 bases to reflect systematic uncertainties results
in 32 black dots in Fig. A-8. The uncertainties in the W and Z cross sections due
to variations in the renormalization and factorization scales are nearly 100%
correlated; varying these scales affects both the W and Z inclusive cross sections in the
same way. The uncertainties in the parton distribution functions and the choice of
renormalization and factorization scales represent the dominant contributions to the
theoretical uncertainty in the total inclusive W and Z cross section calculations at
the Tevatron. The term in $\chi^2_{\text{constraints}}$ that reflects our knowledge of the theoretical
prediction of the inclusive W and Z cross sections explicitly acknowledges this high
degree of correlation.

Figure A-9: Calculation of the $\gamma\gamma j$ $k$-factor, as a function of jet transverse momentum.
The effect of changing the factorization scale by a factor of two in either direction is
also shown (small black points with error bars).

Theoretical constraints on all other $k$-factors are assumed to be uncorrelated with
each other, not because the uncertainties of these calculations are indeed uncorrelated,
but rather because the correlations among these computations are poorly known.
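
To illustrate what a correlated 2-dimensional Gaussian constraint on the W and Z $k$-factors implies, the sketch below evaluates the $\chi^2$ penalty for a shift of both $k$-factors in the same direction and in opposite directions. The central values and widths are placeholders chosen for this example; only the correlation of 0.955 is taken from the caption of Fig. A-8, and it is used here as a stand-in for the full theoretical correlation.

```python
import numpy as np


def correlated_constraint_chi2(k, k_theory, sigmas, rho):
    """chi^2 penalty of a 2-D Gaussian constraint on (k_W, k_Z) with correlation rho."""
    s_w, s_z = sigmas
    cov = np.array([[s_w**2, rho * s_w * s_z],
                    [rho * s_w * s_z, s_z**2]])
    d = np.asarray(k, dtype=float) - np.asarray(k_theory, dtype=float)
    return float(d @ np.linalg.solve(cov, d))


# ASSUMPTION: central values and widths are placeholders, not thesis values.
k_theory = (1.35, 1.35)
sigmas = (0.05, 0.05)
rho = 0.955

same_dir = correlated_constraint_chi2((1.40, 1.40), k_theory, sigmas, rho)
opposite = correlated_constraint_chi2((1.40, 1.30), k_theory, sigmas, rho)
```

A common one-sigma shift of both $k$-factors costs about one unit of $\chi^2$, while an opposite one-sigma shift is penalized far more strongly, which is the practical meaning of the high degree of correlation.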



0002, 0003. The cosmic $\gamma$ and cosmic $j$ backgrounds are estimated using events
recorded in the CDF data with one or more reconstructed photons and with two or
fewer reconstructed tracks. The use of events with two or fewer reconstructed tracks
is a new technique for estimating these backgrounds. These correction factors are
primarily constrained by the number of events in the Vista $\gamma\not\!p$ and $j\not\!p$ final states.
The values are related to (and consistent with) the fraction of bunch crossings with
one or more inelastic $p\bar{p}$ interactions, complicated slightly by the requirement that
any jet falling in the final state has at least 5 GeV of track $p_T$ within a cone of 0.4
relative to the jet axis.

0004, 0005, 0006, 0007. The NLOJET++ calculation of the $\gamma j$ inclusive $k$-factor
constrains the cross section weighted sum of the $\gamma j$, $\gamma 2j$, $\gamma 3j$, and $\gamma 4j$ correction
factors to $1.25 \pm 0.15$ [85, 86].

0008, 0009, 0010. The DIPHOX calculation of the inclusive $\gamma\gamma$ cross section at
NLO constrains the weighted sum of these correction factors to $2.0 \pm 0.15$ [87]. From
Table 4.2, the $\gamma\gamma j$ $k$-factor (0009) appears anomalously large. Figure A-9 shows a
calculation of this $\gamma\gamma j$ $k$-factor using NLOJET++ [85, 86] as a function of summed
transverse momentum. The NLO correction to the LO prediction is found to be large,
and not manifestly inconsistent with the value for this $k$-factor determined from the
Vista fit. The cross section for $\gamma\gamma 2j$ production has not been calculated at NLO.

0011, 0012, 0013, 0014. These correction factors correspond to $k$-factors for W
production in association with zero, one, two, and three or more jets, respectively. A
linear combination of these correction factors is constrained by the requirement that
the inclusive W production cross section is consistent with the NNLO calculation
of Ref. [89]. The values of these correction factors, and their trend of decreasing as
the number of jets increases, depend heavily on the choice of renormalization and
factorization scales. The individual correction factors are not explicitly constrained
by an NLO calculation.

0015, 0016, 0017. These correction factors correspond to $k$-factors for Z production
in association with zero, one, and two or more jets, respectively. A linear combination 
of these correction factors is constrained by the requirement that the inclusive Z 
production cross section is consistent with the NNLO calculation of Ref. [89]. 




0018, 0019. The two $k$-factors for dijet production correspond to two bins in $\hat{p}_T$,
the $p_T$ of the hard two to two scattering in the parton center of mass frame. These
correction factors are constrained by an NLO calculation [90], and show the expected
behavior as a function of $\hat{p}_T$.

0020, 0021. The two $k$-factors for 3-jet production, corresponding to two bins in
$\hat{p}_T$, are unconstrained by any NLO calculation, but show reasonable behavior as a
function of $\hat{p}_T$.

0022, 0023. The $k$-factors for 4-jet production, corresponding to two bins in $\hat{p}_T$, are
unconstrained by any NLO calculation, but show reasonable behavior as a function
of $\hat{p}_T$.

0024. The $k$-factor for the production of five or more jets, constrained primarily by
the Vista low-$p_T$ 5j final state, is found to be close to unity.

A.4.2 Identification efficiencies

The correction factors in this section, although billed as "identification efficiencies,"
are truly ratios of the identification efficiency in the data relative to the identification
efficiency in CdfSim. A correction factor value of unity indicates a proper modeling
of the overall identification efficiency by CdfSim; a correction factor value of 0.5
indicates that CdfSim overestimates the overall identification efficiency by a factor
of two.

0025. The central electron identification efficiency scale factor is close to unity,
indicating the central electron efficiency measured in data is similar (to within 1%)
to the central electron efficiency in the CDF detector simulation. This reflects an
emphasis within CDF on tuning the detector simulation for central electrons. The
determination of this correction factor is dominated by the Vista final states $e\not\!p$ and
$e^+e^-$, where one of the electrons has $|\eta| < 1$.

0026. The plug electron identification efficiency scale factor is several percent less
than unity, indicating that the CDF detector simulation slightly overestimates the
electron identification efficiency in the plug region of the CDF detector. The
determination of this correction factor is dominated by the Vista final states $e\not\!p$ and $e^+e^-$,
where one of the electrons has $|\eta| > 1$.

0027, 0028. To reduce backgrounds hypothesized to arise from pion and kaon decays
in flight with a substantially mismeasured track, a very good track fit in the CDF
tracker is required. Partially due to this tight track fit requirement, CDF muon
identification efficiencies in the regions $|\eta| < 0.6$ and $0.6 < |\eta| < 1.5$ are overestimated
in the CDF detector simulation by over 10%. The determination of the identification
efficiencies $p(\mu \to \mu)$ is dominated by the Vista final states $\mu\not\!p$ and $\mu^+\mu^-$.

0029. The central photon identification efficiency scale factor is determined primarily
by the number of events in the Vista final states $j\gamma$ and $\gamma\gamma$. The uncertainty on
this correction factor is highly correlated with the uncertainties on the $\gamma j$ $k$-factor,
the $p(j \to \gamma)$ fake rate, and the $\gamma\gamma$ $k$-factor.

0030. The plug photon identification efficiency scale factor is determined primarily
by the number of events in the Vista final state $\gamma\gamma$. The uncertainty on this correction
factor is highly correlated with the uncertainty on the plug $p(j \to \gamma)$ fake rate.

0031. The $b$-jet identification efficiency is determined to be consistent with the
prediction from CdfSim.

A.4.3 Fake rates

0032. The fake rate $p(e \to \gamma)$ for electrons to be misreconstructed as photons in
the plug region of the detector is added on top of the significant number of electrons
misreconstructed as photons by CdfSim.

0033. In Vista, the contribution of jets faking electrons is modeled by applying a
fake rate $p(j \to e)$ to Monte Carlo jets. Vista represents the first large scale Tevatron
analysis in which a completely Monte Carlo based modeling of jets faking electrons
is employed. Significant understanding of the physical mechanisms contributing to
this fake rate has been achieved, as summarized in Appendix A.1. Consistency with
this understanding is required; for example, $p(j \to e) \approx p(j \to \gamma)\,p(\gamma \to e)$. The
value of this correction factor is determined primarily by the number of events in the 
Vista final state ej, where the electron is identified in the central region of the CDF 
detector. It is notable that this fake rate is independent of global event properties, 
and that a consistent simultaneous understanding of the ej, e2j, e3j, and e4j final 
states is obtained. 

0034. The value of the fake rate $p(j \to e)$ in the plug region of the CDF detector
is roughly one order of magnitude larger than the corresponding fake rate $p(j \to e)$
in the central region of the detector, consistent with an understanding of the relative 
performance of the detector in the central and plug regions for the identification of 
electrons. This correction factor is determined primarily by the number of events in 
the Vista final state ej, where the electron is identified in the plug region of the CDF 
detector. 

0035. In Vista, the contribution of jets faking muons is modeled by applying a fake
rate $p(j \to \mu)$ to Monte Carlo jets. Vista represents the first large scale Tevatron
analysis in which a completely Monte Carlo based modeling of jets faking muons is
employed. The value obtained from the Vista fit is seen to be roughly one order of
magnitude smaller than the fake rate $p(j \to e)$ in the central region of the detector,
consistent with our understanding of the physical mechanisms underlying these fake
rates, as described in Appendix A.1. The value of this correction factor is determined
primarily by the number of events in the Vista final state $j\mu$.

0036. The fake rate $p(j \to b)$ has $p_T$ dependence explicitly imposed. The number
of tracks inside a typical jet, and hence the probability that a secondary vertex is
(mis)reconstructed, increases with jet $p_T$. The values of these correction factors are
consistent with the mistag rate determined using secondary vertices reconstructed on
the other side of the beam axis with respect to the direction of the tagged jet [91].
The value of this correction factor is determined primarily by the number of events
in the Vista final states $bj$ and $bb$.

0037, 0038. The fake rate $p(j \to \tau)$ decreases with jet $p_T$, since the number of
tracks inside a typical jet increases with jet $p_T$. The values of these correction factors
are determined primarily by the number of events in the Vista final state $j\tau$.

0039, 0040. The fake rate $p(j \to \gamma)$ is determined separately in the central and plug
regions of the CDF detector. The values of these correction factors are determined
primarily by the number of events in the Vista final states $j\gamma$ and $\gamma\gamma$. The value
obtained for 0039 is consistent with the value obtained from a study using detailed 
information from the central preshower detector. The fake rate determined in the 
plug region is noticeably higher than the fake rate determined in the central region, 
as expected. 

A.4.4 Trigger efficiencies

0041. The central electron trigger inefficiency is dominated by not correctly recon- 
structing the electron's track at the first online trigger level. 

0042. The plug electron trigger inefficiency is due to inefficiencies in clustering at 
the second online trigger level. 

0043, 0044. The muon trigger inefficiencies in the regions $|\eta| < 0.6$ and $0.6 < |\eta| < 1.0$
derive partly from tracking inefficiency, and partly from an inefficiency in
reconstructing muon stubs in the CDF muon chambers.

The values of these correction factors are consistent with other trigger efficiency
measurements made using additional information [92].

A.4.5 Energy scales

The Vista infrastructure also allows the jet energy scale to be treated as a correction 
factor. At present this correction factor is not used, since there is no discrepancy 
requiring it. 

To understand the effect of introducing such a correction factor, a jet energy scale 
correction factor is added and constrained to 1 ± 0.03, reflecting the jet energy scale 
determination at CDF [50]. The fit returns a value with a very small error, since 
this correction factor is highly constrained by the low-$p_T$ 2j, 3j, ej, and e2j final
states. Assuming perfectly correct modeling of jets faking electrons, as described 
in Appendix A.l, this is a correct energy scale error. The inclusion of additional 
correction factor degrees of freedom to reflect possible imperfections in this modeling 
of jets faking electrons increases the energy scale error. The interesting conclusion is 
that the jet energy scale (considered as a lone free parameter) is very well constrained 
by the large number of dijet events; adjustment to the jet energy scale must be 
accompanied by simultaneous adjustment of other correction factors (such as the 
dijet $k$-factor) in order to retain agreement with data.

A.5 Sleuth details

This appendix elaborates on the Sleuth partitioning rule, and on the minimum
number of events required for a final state to be considered by Sleuth.

A.5.1 Partitioning

Table A.3 lists the Vista final states associated with each Sleuth final state.

Table A.3: Correspondence between Sleuth and Vista final states. The first column
shows the Sleuth final state formed by merging the populated Vista final states in
the second column. Charge conjugates of each Vista final state are implied.

A.5.2 Minimum number of events

This section expands on a subtle point in the definition of the Sleuth algorithm:
for purely practical considerations, only final states in which three or more events are
observed in the data are considered.

Suppose $\mathcal{P}_{e^+e^-b\bar{b}} = 10^{-6}$; then in computing $\tilde{\mathcal{P}}$ all final states with $b > 10^{-6}$ must
be considered and accounted for. (A final state with $b = 10^{-7}$, on the other hand,
counts as only $\approx 0.1$ final states, since the fraction of hypothetical similar experiments
in which $\mathcal{P} < 10^{-6}$ in this final state is equal to the fraction of hypothetical similar
experiments in which one or more events is seen in this final state, which is $10^{-7}$.)
This is a large practical problem, since it requires that all final states with $b > 10^{-6}$
be enumerated and estimated, and it is difficult to do this believably.

To solve this problem, let Sleuth consider only final states with at least $d_{\min}$
events observed in the data. The goal is to be able to find $\tilde{\mathcal{P}} < 10^{-3}$. There will be
some number $N_{\text{fs}}(b_{\min})$ of final states with expected number of events $b > b_{\min}$, writing
$N_{\text{fs}}$ explicitly as a function of $b_{\min}$; thus $b_{\min}$ must be chosen to be sufficiently large
that all of these $N_{\text{fs}}(b_{\min})$ final states can be enumerated and estimated. The time
cost of simulating events is such that the integrated luminosity of Monte Carlo events
is at most 100 times the integrated luminosity of the data; this practical constraint
restricts $b_{\min} > 0.01$. The number of Sleuth Tevatron Run II final states with
$b > 0.01$ is $N_{\text{fs}}(b_{\min} = 0.01) \approx 10^{3}$.

For small $\mathcal{P}_{\min}$, keeping the first term in a binomial expansion yields
$\tilde{\mathcal{P}} = \mathcal{P}_{\min} N_{\text{fs}}(b_{\min})$,
where $\mathcal{P}_{\min}$ is the smallest $\mathcal{P}$ found in any final state. From the discussion above, the
computation of $\tilde{\mathcal{P}}$ from $\mathcal{P}_{\min}$ can only be justified if $\mathcal{P}_{\min} > (b_{\min})^{d_{\min}}$; otherwise,
final states with $b < b_{\min}$ will need to be accounted for. Thus $\tilde{\mathcal{P}}$ can be confidently
computed only if $\tilde{\mathcal{P}} > (b_{\min})^{d_{\min}} N_{\text{fs}}(b_{\min})$.

Solving this inequality for $d_{\min}$ and inserting values from above,

$$d_{\min} > \frac{\log_{10}\tilde{\mathcal{P}} - \log_{10} N_{\text{fs}}(b_{\min})}{\log_{10} b_{\min}} \approx \frac{-3 - 3}{-2} = 3. \qquad \text{(A.28)}$$

A believable trials factor can be computed if $d_{\min} \geq 3$.
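
The arithmetic behind this bound is simple enough to check directly. The sketch below solves the inequality for the minimum number of events; the goal sensitivity of 1e-3 and the count of roughly 1e3 final states are assumed here as the input values, and the function name is hypothetical.

```python
import math


def minimum_events(p_tilde_goal, n_final_states, b_min):
    """Lower bound on d_min from P~ > b_min**d_min * N_fs(b_min).

    Dividing by log10(b_min) < 0 flips the inequality, giving
    d_min > (log10 P~ - log10 N_fs) / log10 b_min.
    """
    return ((math.log10(p_tilde_goal) - math.log10(n_final_states))
            / math.log10(b_min))


# ASSUMPTION: goal of 1e-3, about 1e3 final states, and b_min = 0.01.
d_min = minimum_events(p_tilde_goal=1e-3, n_final_states=1e3, b_min=0.01)
```

Changing the inputs shows how the requirement scales: a more ambitious sensitivity goal or a larger number of final states raises the bound, while a smaller $b_{\min}$ lowers it.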

At the other end of the scale, computational strength limits the maximum number
of events Sleuth is able to consider to $\lesssim 10^{4}$. Excesses in which the number of events
exceeds $10^{4}$ are expected to be identified by Vista's normalization statistic.

A.5.3 p-val$_{\min}$, population and $\mathcal{P}$

Sleuth estimates $\mathcal{P}$ for a given final state by producing pseudo-data, i.e. values
that are distributed according to the Standard Model prediction. It then scans all
$\sum p_T$ tails, finds the smallest p-val and compares it to the p-val$_{\min}$ from the actual
data. That is repeated with many different distributions of pseudo-data, until the
fraction of more interesting pseudo-data distributions (which is $\mathcal{P}$) is determined with
5% relative uncertainty.

In each pseudo-data distribution that is produced, the population of pseudo-data is 
randomly distributed according to a Poisson distribution, whose mean is the Standard 
Model predicted total population [B] for the final state. 

Each examined $\sum p_T$ tail has a p-val that does not take into account the statistical
uncertainty in the background ($b$) contained in the tail. The same is true for both
data and pseudo-data, therefore the effect on the final $\mathcal{P}$ is negligible.
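
The pseudo-data procedure described in this section can be sketched with a toy model. The exponential shape of the expected summed transverse momentum distribution, its scale, the fixed number of pseudo-experiments (instead of iterating to a 5% relative uncertainty), and all function names are simplifying assumptions of this illustration, not the Sleuth implementation.

```python
import math
import random


def poisson_tail(b, d):
    """P(N >= d) for N ~ Poisson(b)."""
    term = math.exp(-b)
    cdf = 0.0
    for k in range(d):
        cdf += term
        term *= b / (k + 1)
    return max(0.0, 1.0 - cdf)


def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def pval_min(sum_pts, total_b, scale, d_min=3):
    """Smallest tail p-val over all tails containing at least d_min events."""
    best = 1.0
    ordered = sorted(sum_pts, reverse=True)
    for n_in_tail, threshold in enumerate(ordered, start=1):
        if n_in_tail < d_min:
            continue
        b_tail = total_b * math.exp(-threshold / scale)  # expected events in tail
        best = min(best, poisson_tail(b_tail, n_in_tail))
    return best


def estimate_p(pval_min_obs, total_b, scale, n_pseudo, rng):
    """Fraction of pseudo-experiments with a smaller (more interesting) p-val_min."""
    more_interesting = 0
    for _ in range(n_pseudo):
        n = sample_poisson(total_b, rng)                  # Poisson population
        pseudo = [rng.expovariate(1.0 / scale) for _ in range(n)]
        if pval_min(pseudo, total_b, scale) < pval_min_obs:
            more_interesting += 1
    return more_interesting / n_pseudo


p_est = estimate_p(1e-2, total_b=5.0, scale=50.0, n_pseudo=2000,
                   rng=random.Random(1))
```

In this toy the tail p-val uses the expected background without any uncertainty, mirroring the treatment applied to both data and pseudo-data.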

Regardless of the particular shape of an expected $\sum p_T$ distribution, p-val$_{\min}$ in
pseudo-data follows the same distribution. Therefore, $\mathcal{P}$ depends only on the p-val$_{\min}$
observed in data, and on the overall expected population; the larger the population,
the bigger the average number of considered $\sum p_T$ tails in pseudo-data, therefore the
larger the $\mathcal{P}$. The dependence of $\mathcal{P}$ on p-val$_{\min}$ and $B$ is shown in Fig. A-10. The
advantage of having tabulated this dependence is that one does not have to
produce pseudo-data repeatedly to estimate $\mathcal{P}$; one can simply read it from Fig. A-10,
for a given $B$ and p-val$_{\min}$. This technique makes the execution of Sleuth incredibly
fast, allowing for studies such as sensitivity tests, projections to different luminosity,
propagation of systematic uncertainties to $\mathcal{P}$, and frequent assessment of the
excesses in data.

Figure A-10: $\mathcal{P}$ as a function of p-val$_{\min}$, for final states of different expected
populations $B$. $\mathcal{P}$ reaches a plateau at p-val$_{\max} = \sum_{i=3}^{\infty} \frac{B^i}{i!} e^{-B}$, which is visible for small $B$,
and reflects the requirement to have at least 3 data events in a $\sum p_T$ tail to consider
it. $\mathcal{P}$ values have been estimated to 5% relative uncertainty.

Appendix B

Correction Model Details,
reflecting the 2 fb$^{-1}$ analysis

B.1 Details on Event Selection

Although specific online triggers are not explicitly required, it is still possible to
identify the primary online triggers which feed this analysis. These are:

• electron_central_18 

• muon_central_18 

• photon_25_iso 

• jet20 

• jet100

• susy dilepton triggers: electron_central_8_&_track8, cem4_cmup4, cem4_cmx4,
cem4_pem8, cmup4_pem8, cmx4_pem8, dielectron_central_4, dimuon_cmup4_cmx4,
dimuon_cmupcmup4

• susy dilepton triggers muon_cmup8_&_track8 and muon_cmx8_&_track8 (introduced
in run number 200274, roughly 600 pb$^{-1}$ into Run II)

• hadronic ditau trigger (introduced roughly 300 pb$^{-1}$ into Run II)

The following datasets were used:

• HighPt Central Electron stream: bhel0d, bhel0h, bhel0i, bhel0j

• HighPt CMUP and CMX muon stream: bhmu0d, bhmu0h, bhmu0i, bhmu0j

• HighPt Photon stream: cph10d, cph10h, cph10i, cph10j

• SUSY dilepton stream: edil0d, edil0h, edil0i, edil0j

• Ditau stream: etau0d, etau0h, etau0i, etau0j

• Jet20 stream: gjt10d, gjt10h, gjt10i, gjt10j

• Jet100 stream: gjt40d, gjt40h, gjt40i, gjt40j

B.2 Details on Particle Identification 

This section contains tables of information related to particle identification.
Electron identification is described in Tables B.1 and B.2; muon identification in
Tables B.3, B.4, B.5, and B.6; tau identification in Table B.7; and photon identification
in Tables B.8 and B.9. Standard fiducial criteria apply. The standard CDF SecVtx
algorithm is used to identify b-jets.

Jets are identified using the JetClu [49] clustering algorithm with cone size ΔR =
0.4, unless the event contains one or more jets with p_T > 200 GeV and no leptons or
photons, in which case cones of ΔR = 0.7 are used. Jets with p_T > 150 GeV are
required to have at least 5 GeV of track p_T within the cone.
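
The cone-size rule above can be sketched in a few lines of Python. This is only an illustration: the `Event` container, its field names, and the function name are hypothetical, and the sketch assumes the jet p_T values used for the decision are already available.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Event:
    """Hypothetical, minimal event record used only for this illustration."""
    jet_pts: Sequence[float]  # jet p_T values in GeV
    n_leptons: int            # number of identified leptons
    n_photons: int            # number of identified photons


def jet_cone_size(event: Event, hard_jet_threshold: float = 200.0) -> float:
    """Return the JetClu cone size (Delta R) following the rule in the text.

    Cones of 0.7 are used only when the event has at least one jet above
    the threshold and contains no leptons or photons; otherwise 0.4.
    """
    has_hard_jet = any(pt > hard_jet_threshold for pt in event.jet_pts)
    if has_hard_jet and event.n_leptons == 0 and event.n_photons == 0:
        return 0.7
    return 0.4
```

For example, a purely hadronic event with a 250 GeV jet would be clustered with ΔR = 0.7, while the same event with an identified electron would keep ΔR = 0.4.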

B.3 Vista: Single Particle Gun Results

Tables B.10 and B.11 show the response of the CDF detector simulation, reconstruction,
and particle identification algorithms to single particles in the central and plug
regions respectively, with all changes to particle identification criteria discussed in

Variable                 Cut
Region                   Fiducial CES
Track Z0                 < 60 cm
COT Ax. Seg.             > 3
COT St. Seg.             > 2
Signed CES ΔX            -3.0 < qΔX < 1.5
[name illegible]         < 3.0 cm
[name illegible]         [cut illegible in source]
p_T/E_T (if p_T < 50)    > 0.5
Had/Em                   < 0.055 + 0.00045 × E
Isolation                < 0.1 × E_T
LShrTrk                  < 0.2
CES StripChi2            < 10.0
Conversion               FALSE

Table B.1: Central electron identification criteria used in Vista and Sleuth. These
correspond to TightCEM electrons as defined in [93], for Gen5 and Gen6. The
conversion finder flags a second track with |ΔXY| < 0.2 cm, |Δcot θ| < 0.015, and
p_T > 0 GeV.



Variable                        Cut
Region                          |η| < 2.6
Had/Em                          < 0.05
Isolation                       < 0.1 × E_T
PEM Chi2                        < 10
PES U                           > 0.65
PES V                           > 0.65
PHX Track                       TRUE
N SVX hits                      > 3
deltaR(PHX Track, EM cluster)   < 0.01

Table B.2: Plug electron identification criteria used in Vista and Sleuth. These
correspond to Tight Phoenix electrons as defined in [93], except for the cut on ΔR.
Variable                     Cut
Larry curvature correction   Applied
Track Z0                     < 60 cm
COT Ax. Seg.                 > 3
COT St. Seg.                 > 3
Iso/p_T                      < 0.1
EM + Had Energy              > 0.1 GeV
EM Energy                    < 2.0 + 0.0115 (p - 100) × (p > 100)
Had Energy                   < 6.0 + 0.0280 (p - 100) × (p > 100)
COT (gen5)                   < 3 for p_T < 60; < 2 for p_T > 60
COT (gen6)                   < 2
Track With Si hits           |d0| < 0.02
Track Without Si hits        |d0| < 0.2

Table B.3: Common muon identification criteria used in Vista and Sleuth.



Variable            Cut
CMU ΔX              < 7 cm
CMP ΔX              < 5 cm
CMUP Fiducial       TRUE
No bluebeam muons   For Runs < 154449

Table B.4: CMUP muon identification criteria used in Vista and Sleuth. These
are in addition to the criteria common to all muons.

Variable                      Cut
CMX ΔX                        < 6 cm
COT exit radius               > 140 cm
CMX Fiducial                  TRUE
Run                           > 150144
Keystone and Miniskirt good   run > 190697
Exclude wedge 14, west        For 190697 < run < 209760

Table B.5: CMX muon identification criteria used in Vista and Sleuth. These are
in addition to the criteria common to all muons.

Variable       Cut
BMU ΔX         < 10 cm
BMU Fiducial   TRUE

Table B.6: BMU muon identification criteria used in Vista and Sleuth. These are
in addition to the criteria common to all muons.
Variable                 Cut
[first row illegible in the source; the surviving fragment mentions an annulus of 10-30°]
Track isolation (gen6)   SumPt of tracks in annulus [lower bound illegible]-0.4 rad < 1
Calorimeter isolation    Iso/E_T < 0.1
[several rows illegible in the source]
One Prong Tau            N tracks in 10° cone = 1
Electron removal         [illegible] > 0.1 and EMfraction < 0.925
Not a Muon               No matching muon stubs and Cal E_T / Seed Track p_T > 0.5

Table B.7: Table of τ identification criteria used in Vista and Sleuth.



Variable                              Cut
Fiducial Region (X)                   CES |X| < 21 cm
Fiducial Region (Z)                   9 < CES Z < 230 cm
Had/Em                                < 0.125 || < 0.055 + 0.00045 × E
Isolation (E_T < 20)                  < 0.1 × E_T
Isolation (E_T > 20)                  < 2.0 + 0.02 × (E_T - 20)
Track isolation, cone 0.4             SumPt < 2 + 0.005 × E_T
Ntrack (N3D)                          N3D ≤ 1
Track p_T (if N3D = 1)                < 1.0 + 0.005 × E_T
Chi2 (Strips+Wires)/2.0               < 20
2nd CES clus. E sin θ (E_T < 18)      strip+wire < 0.14 × E_T
2nd CES clus. E sin θ (E_T > 18)      strip+wire < 2.52 + 0.01 × E_T

Table B.8: Central photon identification criteria used in Vista and Sleuth. Here
E_T refers to corrected photon E_T. The "2nd CES Cluster" cut is tighter than the
standard photon cut.
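
As an illustration of how the piecewise isolation requirement of Table B.8 can be evaluated, here is a minimal Python sketch. The function names are hypothetical, energies are assumed to be in GeV, and the table does not say which branch applies at exactly E_T = 20 (both give the same bound there).

```python
def central_photon_isolation_bound(et: float) -> float:
    """Upper bound on calorimeter isolation (GeV) for corrected photon E_T (GeV).

    Below 20 GeV the bound scales with E_T; above 20 GeV it grows more slowly.
    The two branches meet at E_T = 20 GeV, where both give 2.0 GeV.
    """
    if et < 20.0:
        return 0.1 * et
    return 2.0 + 0.02 * (et - 20.0)


def passes_isolation(isolation: float, et: float) -> bool:
    """True if the measured isolation energy is below the E_T-dependent bound."""
    return isolation < central_photon_isolation_bound(et)
```

The continuity at 20 GeV means a photon candidate does not see a jump in the allowed isolation energy when its corrected E_T crosses the boundary between the two rows of the table.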
Variable                                  Cut
Region                                    1.2 < |η| < 2.6
Had/Em (E_T < 100)                        < 0.05
Had/Em (E_T > 100)                        < 0.05 + 0.026 × ln(E_T/100)
Isolation (E_T < 20)                      < 0.1 × E_T
Isolation (E_T > 20)                      < 2 + 0.02 × E_T
Track Isolation (in a cone of dR < 0.4)   < 2.0 + 0.005 × E_T
PEM Chi2                                  < 10
PES U                                     > 0.65
PES V                                     > 0.65

Table B.9: Plug photon identification criteria used in Vista and Sleuth. Here E_T
refers to corrected photon E_T. These are the standard Joint Physics cuts.

section 4.2.1. We use a single particle gun to shoot 10⁵ particles of each type, with
p_T = 25 GeV, uniformly distributed in θ and φ. The types of generated particles label
the rows, while the resulting reconstructed objects label the columns of each table.
Table B.12 shows a similar study with 10⁴ particles at p_T = 50 GeV. These results
are not directly used in the analysis, but provide a sensible cross-check for the
fake rates and identification efficiencies used.
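
To make the cross-check concrete, a matrix entry can be turned into a misidentification probability by dividing by the number of generated particles. The sketch below is illustrative only: the function name is hypothetical, and the binomial uncertainty is an assumption added here, not something the tables provide. The count 66111 is read from the τ⁻ row, j column of the plug matrix (Table B.11).

```python
import math


def misid_probability(count: int, n_generated: int) -> tuple[float, float]:
    """Return (probability, binomial standard error) for one matrix element.

    count       -- number of generated particles reconstructed as the target object
    n_generated -- number of particles shot by the single particle gun
    """
    p = count / n_generated
    sigma = math.sqrt(p * (1.0 - p) / n_generated)
    return p, sigma


# Generated tau- reconstructed as a jet in the plug, out of 10^5 particles.
p_tau_to_jet, sigma_tau_to_jet = misid_probability(66111, 100_000)
```

With 10⁵ generated particles the statistical uncertainty on such a large entry is at the permille level, whereas entries of only a few events carry relative uncertainties of tens of percent.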

It should be noted that the number of photons reconstructed as electrons decreased
compared to the last round of this analysis. As expected, the number of electrons
identified with the wrong charge has decreased proportionately, as has the number
of π⁰ reconstructed as electrons. All of these result from tightening the
conversion filter, by removing the lower p_T threshold that was previously
required when looking for sibling tracks coming from a conversion.

Figures B-1 and B-2 show the p_T distributions of the reconstructed object (column
label), resulting from the initial particle (row label), for the central and plug regions
of the detector respectively. We note that the p_T resolution of reconstructed τs has
worsened, consistent with obtaining p_T from calorimeter E_T rather than visible
momentum.
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6336 


10 


652 


55677 


613 




17 


6623 





4907 


9 


6064 


615 


56201 


580 




12 


8 


2 





658 


29 


247 


98645 


24 




4 


16 


1 


1 


55 


428 


181 


98916 


21 




10 


8 





4 


29 


31 


12 


98190 


99 



Table B.10: Central single particle misidentification matrix. Using a single particle
gun, 10⁵ particles of each type shown at the left of the table are shot with p_T =
25 GeV into the central CDF detector, uniformly distributed in θ and in φ. The
resulting reconstructed object types are shown at the top of the table, labeling the
table columns. Thus the rightmost element of this matrix in the fourth row from the
bottom shows p(τ⁻ → j), the number of negatively charged tau leptons (out of 10⁵)
reconstructed as a jet.
        e+     e-     μ+     μ-     τ+     τ-      γ      j      b
e+    3737     26      -      -      1      -  71307  24597      -
e-      24   3834      -      -      -      4  71003  24789      -
μ+       -      -  10661      -      3      -      1   1061      -
μ-       -      -      -  10678      -      2      3   1127      -
γ       55     65      -      -      -      -  76374  23064      -
π0      46     53      -      -      -      1  74111  25522      -
π+      17      -     16      -   4395      2    554  93462     25
π-       1     24      -     10      3   4206    673  93570     20
K+      13      -     59      -   4658      -    421  92807     12
K-       -     36      -     38      1   4301    834  92958     23
B+      50      2    102      3      4      -    186  90077   7389
B-       3     18      2     81      -     13    160  90178   7347
B0      52     12     96     15      3      8    148  90241   7016
B0bar   10     52      8    107      4      5    126  90149   7095
D+      32      4     90      -    136     11    738  94326   2148
D-       2     22      1     57      6    127    817  94367   2100
D0       9      7     38      2     20     74    671  96983   1027
D0bar    2     15      3     37     74     17    628  96928   1121
K0_L     1      -      -      -      6      8   1089  97411      6
K0_S     2      3      -      -     11     39   9532  56689      -
τ+     339      8   1249      1    341      2   3198  66243    104
τ-       5    346      -   1226      -    336   3208  66111    108
u       19     12      2      -     73     13    423  99359     47
d       13     17      -      -     15     36    359  99357     60
g        7     11      8      1     19     18     41  98937    426


Table B.11: Plug single particle misidentification matrix. Using a single particle gun,
10⁵ particles of each type shown at the left of the table are shot with p_T = 25 GeV
into the plug CDF detector, uniformly distributed in θ and in φ. The resulting
reconstructed object types are shown at the top of the table, labeling the table columns.
Thus the rightmost element of this matrix in the fourth row from the bottom shows
p(τ⁻ → j), the number of negatively charged tau leptons (out of 10⁵) reconstructed
as a jet.
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Figure B-1: Transverse momentum distribution of reconstructed objects (labelling 
columns) arising from single particles (labelling rows) with pt = 25 GeV shot from a 
single particle gun into the central CDF detector. The area under each histogram is 
equal to the number of events in the corresponding misidentification matrix element 
of Table B.10; histograms with fewer than ten events are not shown. The horizontal
axis ranges from 0 to 50 GeV, with one tick mark each 5 GeV. The incident single
particle distribution is a delta function at the center of each plot, at p_T = 25 GeV.
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Figure B-2: Transverse momentum distribution of reconstructed objects (labelling 
columns) arising from single particles (labelling rows) with p_T = 25 GeV shot from
a single particle gun into the plug CDF detector. The area under each histogram is
equal to the number of events in the corresponding misidentification matrix element
of Table B.11; histograms with fewer than ten events are not shown. The horizontal
axis ranges from 0 to 50 GeV, with one tick mark each 5 GeV. The incident single
particle distribution is a delta function at the center of each plot, at p_T = 25 GeV.
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e+ 


6060 
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38 





139 


3576 
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e~ 





6103 
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128 
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1 





5201 





2 


1 


301 





7 


55 


75 














6554 


3118 







42 


38 














5751 


3991 





7r+ 


19 





9 





7721 





2 


2089 


19 







15 





4 


3 


7761 


2 


2100 


9 




10 





20 





7662 


2 


2 


2109 


5 







25 





11 


5 


7682 


3 


2119 


10 


B+ 


18 


2 


6 





13 


1 





5160 


4792 


B- 


1 


9 





6 


1 


10 


1 


5186 


4776 




13 


3 


5 





13 


9 





5111 


4836 


SO 





11 





3 


7 


7 





5095 


4868 




41 


1 


20 





726 


10 


14 


7861 


1241 


D- 


2 


52 


1 


11 


10 


696 


10 


7898 


1247 


D° 


9 


4 


7 


2 


31 


93 


25 


9036 


767 


DO 


3 


11 





9 


87 


39 


24 


9081 


693 


Kl 














7 


7 


13 


9849 


16 


Kl 





1 








18 


69 


810 


5855 


1 


r+ 


889 


2 


711 





1506 


1 


63 


5008 


191 


T~ 


3 


816 





672 


1 


1496 


107 


5088 


188 


U 


1 


1 








46 


2 


19 


9923 


31 


d 


1 











3 


35 


7 


9944 


24 


9 


1 


1 





2 


3 


3 





9897 


174 



Table B.12: Central single particle misidentification matrix. Using a single particle
gun, 10⁴ particles of each type shown at the left of the table are shot with p_T =
50 GeV into the central CDF detector, uniformly distributed in θ and in φ. The
resulting reconstructed object types are shown at the top of the table, labeling the
table columns.
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B.4 Fake Rates 



It would take too many Monte Carlo events to acquire enough statistics for rare fake
processes. To overcome this difficulty, we apply our own multiplicative fake rates to
reconstructed objects, when they are reconstructed more often than the objects they
may fake. Specifically, we apply fake rates for jets or b-tagged jets faking electrons,
muons, photons, and τs; jets faking b-tagged jets; and photons faking electrons. Note
that other fake processes are not neglected; they are handled by CDFSim. We try to
keep our fake rates as simple as possible. There is generally one overall coefficient
for the fake rate, and this value is usually obtained from the Vista fit to the data.
In some cases, however, to better model the true fake process, we need to introduce
additional modulations as a function of p_T or location within the detector (η or φ).
This section details all the special modulations applied to Vista fake rates. Generally,
we show a modulating function, which multiplies the appropriate correction factor
value to obtain the true fake rate applied. If no modulation is shown here, the fake
rate is treated as constant.
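
The multiplicative structure described above can be sketched as follows. This is a minimal illustration, not the Vista implementation: the function names, the signature of the modulating function, and the toy modulation shape are all hypothetical.

```python
from typing import Callable, Optional

# A modulating function maps (p_T, eta_det, phi) to a relative fake rate.
Modulation = Callable[[float, float, float], float]


def applied_fake_rate(correction_factor: float,
                      modulation: Optional[Modulation] = None,
                      pt: float = 0.0,
                      eta_det: float = 0.0,
                      phi: float = 0.0) -> float:
    """Fake rate applied to one reconstructed object.

    The overall coefficient (a fitted correction factor) is multiplied by the
    modulating function when one is defined; otherwise the rate is constant.
    """
    if modulation is None:
        return correction_factor
    return correction_factor * modulation(pt, eta_det, phi)


def toy_modulation(pt: float, eta_det: float, phi: float) -> float:
    """Made-up shape for illustration: 20% higher rate for |eta_det| > 1."""
    return 1.2 if abs(eta_det) > 1.0 else 1.0
```

Keeping the coefficient and the shape separate means the fit only has to determine one number per fake process, while the shape can be studied independently in the control regions.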

Figures B-4 and B-5 show the relative fake rate for jets to fake electrons as a
function of η_det and φ. These functions of η_det and φ are multiplied by overall correction
factors which represent a crude average fake rate over the appropriate region. These
shaped functions are meant to model finer details in the fake rates than the overall
average can contain. In addition to the η_det and φ dependence, for plug electrons there
is a dependence on p_T, shown in Figure B-3. Figures B-6, B-7, and B-8 show
the electron p_T, η_det, and φ distributions from data in the 1e+1j final state,
where almost all events come from QCD dijet production in which one of the jets fakes
an electron. This serves as the dominant control region for determining variations in
the jet to electron fake rate.

Figures B-9 and B-10 show the fake rate variation for jets to fake muons as a
function of p_T and η_det. The fake rate is higher in CMX than in CMU and CMP. The
muon p_T, η_det, and φ distributions in the 1j1mu+ final state are shown in Fig. B-11,
B-12, and B-13. These serve as the dominant control regions determining these fake
Figure B-3: The relative fake rate for jets to fake electrons in the plug as a function
of the p_T of the jet.

Figure B-4: The relative fake rate for jets to fake electrons as a function of detEta.

Figure B-5: The relative fake rate for jets to fake electrons as a function of phi.

Figure B-6: Electron p_T distribution in the 1e+1j final state.

Figure B-7: Electron detector eta distribution in the 1e+1j final state.

Figure B-8: Electron phi distribution in the 1e+1j final state.

Figure B-9: The relative fake rate for jets to fake muons as a function of p_T.

Figure B-10: The relative fake rate for jets to fake muons as a function of η_det.

rates. 

Figures B-14, B-15, and B-16 show the jet to photon fake rates as functions of
p_T, η_det, and φ. Detector geometry features are analogous to those exhibited in the
jet to electron fake rate. The photon p_T, η_det, and φ distributions in the 1j1ph final
state are shown in Fig. B-17, B-18, and B-19. This is one of the dominant control
regions determining the jet to photon fake rates. Unlike the previous two cases, this
final state is dominated by real γ+jet production, rather than the fake process, which
contributes about 35% to this final state.

The variation in the rate for jets to fake b-jets is shown in Fig. B-20, as a function
of p_T. This shape is consistent with the one measured by the b-tagging group. Before
comparing absolute values, however, it should be noted that this Vista fake rate includes
contributions from charm quarks faking b, which are not usually included in the b-tagging
Figure B-11: Muon p_T distribution in the 1j1mu+ final state.

Figure B-14: The relative fake rate for jets to fake photons as a function of p_T.

Figure B-15: The relative fake rate for jets to fake photons as a function of η_det.

Figure B-16: The relative fake rate for jets to fake photons as a function of φ.

Figure B-17: Photon p_T distribution in the 1j1ph final state.

Figure B-18: Photon η_det distribution in the 1j1ph final state.

Figure B-19: Photon φ distribution in the 1j1ph final state.

Figure B-20: The relative fake rate for jets to fake b-tagged jets as a function of p_T.
It is essentially the mistag rate.

Figure B-21: The b-jet p_T distribution in the 1b1j low final state.



mistag rate. When we accounted for the expected relative contribution of charmed
quarks in our 'denominator jets', we found values consistent with the mistag rates.
The b-jet p_T distribution is shown in Fig. B-21 and B-22, for the 1b1j high and
1b1j low final states. These are the dominant control regions determining the
mistag rates.

The jet to τ relative fake rate is given in Fig. B-23. This shape is then multiplied
by the function exp(−GeneratedSumPt / 350 GeV) and the jet to τ fake rate correction
factor to obtain the final fake rate. The shape is consistent with previous studies of
the jet to τ fake rate. The τ p_T distributions in the 1j1tau+ low-Σp_T, 1j1tau+
high-Σp_T, and 1tau+1tau- final states are shown in Fig. B-24, B-25, and B-26. These
serve as the dominant control regions determining the jet to τ fake rate.
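
Written as a single expression, the jet to τ fake rate described above reads as follows. The symbols are introduced here only for illustration: f(p_T) stands for the relative shape of Fig. B-23 and s_{j→τ} for the fitted jet to τ correction factor.

```latex
\[
  p(j \to \tau)
  \;=\;
  s_{j\to\tau}\,
  f(p_T)\,
  \exp\!\left(-\,\frac{\mathrm{GeneratedSumPt}}{350~\mathrm{GeV}}\right)
\]
```

The exponential factor suppresses the fake rate in events with large generated summed p_T; at GeneratedSumPt = 350 GeV it reduces the rate by a factor of e.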



[Figure B-22 plot: b p_T (GeV) in the 1b1j final state with Σp_T > 400 GeV. Legend:
CDF Run II Data; Other; Herwig tt: 0.1%; Pythia jγ: 0.3%; Pythia bj: 16.7%;
Pythia jj: 82.8%.]

Figure B-22: The b-jet p_T distribution in the 1b1j high-Σp_T final state.
Figure B-23: The relative fake rate for jets to fake τs as a function of p_T.



[Figure B-24 plot: τ+ p_T (GeV) in the 1j1tau+_sumPt0-400 final state, CDF Run II
Preliminary (2.0 fb⁻¹). Legend: CDF Run II Data; Other; MadEvent Z(→ee) jj: 0.1%;
Pythia Z(→ττ): 1.2%; Pythia bj: 1.9%; Pythia jj: 96.4%.]

Figure B-24: The τ p_T distribution in the 1j1tau+ low-Σp_T final state.
Figure B-25: The τ p_T distribution in the 1j1tau+ high-Σp_T final state.

Figure B-26: The τ p_T distribution in the 1tau+1tau- final state.

Figure B-27: The relative fake rate for jets to fake τs as a function of p_T.

Figure B-28: The electron p_T distribution in the 1e+1ph final state.



Figure B-27 shows the relative fake rate for photons to fake electrons as a function
of η_det. Fig. B-28 and B-29 show the electron p_T and η_det distributions in the 1e+1ph
final state. This final state is the dominant control region determining the photon to
electron fake rate. However, this underlying process does not contribute very much
to the background in this final state and, as a result, the photon to electron fake rate
is not as well constrained as other fake rates. Fig. B-30 and B-31 show the photon p_T
and η_det distributions in this same final state. As a general comment, this final state
is a particularly good example of how well modelled our fake backgrounds are, since
the background contributing to this final state is a mixture of various different fake
processes.
Figure B-29: The electron η_det distribution in the 1e+1ph final state.

Figure B-30: The photon p_T distribution in the 1e+1ph final state.

Figure B-31: The photon η_det distribution in the 1e+1ph final state.
Code  Category    Explanation                         Before                    After                     Deviation (σ)  Change (%)
5001  luminosity  CDF integrated luminosity           0.927 ± 0.02              2 ± 0.0608                 53.4           115.3
5102  k-factor    cosmic_ph                           0.69 ± 0.05               0.81 ± 0.05                 2.5            18.4
5103  k-factor    cosmic_j                            0.45 ± 0.014              0.19 ± 0.014              -18.2           -57.1
5121  k-factor    1γ1j photon+jet(s)                  0.95 ± 0.04               0.92 ± 0.04                -0.7            -2.9
5122  k-factor    1γ2j                                1.2 ± 0.05                1.3 ± 0.05                  1.3             5.3
5123  k-factor    1γ3j                                1.5 ± 0.07                1.6 ± 0.07                  1.5             7.0
5124  k-factor    1γ4j+                               2 ± 0.16                  1.9 ± 0.14                 -0.5            -3.9
5130  k-factor    2γ0j diphoton(+jets)                1.8 ± 0.08                1.6 ± 0.07                 -2.4           -10.5
5131  k-factor    2γ1j                                3.4 ± 0.24                3 ± 0.17                   -1.8           -12.8
5132  k-factor    2γ2j+                               1.3 ± 0.16                1.2 ± 0.09                 -0.6            -7.7
5141  k-factor    W0j W (+jets)                       1.5 ± 0.027               1.4 ± 0.04                 -2.8            -5.2
5142  k-factor    W1j                                 1.1 ± 0.03                1.3 ± 0.04                  9.1            25.7
5143  k-factor    W2j                                 1 ± 0.03                  2 ± 0.06                   32.0            94.1
5144  k-factor    W3j+                                0.76 ± 0.05               2.1 ± 0.07                 26.9           177.4
5151  k-factor    Z0j Z (+jets)                       1.4 ± 0.024               1.4 ± 0.03                 -1.3            -2.2
5152  k-factor    Z1j                                 1.2 ± 0.04                1.2 ± 0.04                  1.3             4.5
5153  k-factor    Z2j+                                1 ± 0.05                  1 ± 0.04                   -0.3            -1.5
5161  k-factor    2j p_T<150 dijet                    0.96 ± 0.022              1 ± 0.031                   1.9             4.4
5162  k-factor    2j 150<p_T                          1.3 ± 0.028               1.3 ± 0.04                  2.9             6.5
5164  k-factor    3j p_T<150 multijet                 0.92 ± 0.021              0.94 ± 0.03                 1.0             2.3
5165  k-factor    3j 150<p_T                          1.4 ± 0.032               1.5 ± 0.05                  3.7             8.7
5167  k-factor    [illegible]                         0.99 ± 0.025              1.1 ± 0.04                  3.0             7.7
5168  k-factor    4j 150<p_T                          1.7 ± 0.04                1.9 ± 0.07                  5.5            12.8
5169  k-factor    5j low                              1.3 ± 0.05                1.3 ± 0.06                  1.7             6.8
5170  k-factor    1b2j 150<p_T heavyflavor multijet   NA ± NA                   2.2 ± 0.12                  NA              NA
5171  k-factor    1b3j 150<p_T                        NA ± NA                   3 ± 0.16                    NA              NA
5211  misId       p(e→e) central                      0.99 ± 0.006              0.98 ± 0.007               -1.5            -0.9
5212  misId       p(e→e) plug                         0.93 ± 0.009              0.97 ± 0.007                3.6             3.5
5213  misId       p(μ→μ) CMUP+CMX                     0.85 ± 0.008              0.89 ± 0.007                5.3             5.0
[illegible]       p(γ→γ) central                      0.97 ± 0.018              0.95 ± 0.013               [illegible]     [illegible]
5217  misId       p(γ→γ) plug                         0.91 ± 0.018              0.85 ± 0.007               -3.2            -6.4
5219  misId       [illegible]                         [illegible]               [illegible]                -0.8            -3.2
5246  misId       p(γ→e) plug                         NA ± NA                   0.062 ± 0.0021              NA              NA
5256  misId       p(q→e) central                      9.71×10^? ± 1.9×10^?      7.077×10^? ± 1×10^?       -13.9           -27.1
5257  misId       p(q→e) plug                         0.0008761 ± 1.8×10^?      0.0007611 ± 5×10^?         -6.4           -13.1
5261  misId       p(q→μ)                              1.157×10^? ± 2.7×10^?     1.235×10^? ± 5×10^?         2.9             6.7
5266  misId       p(b→μ)                              NA ± NA                   3.522×10^? ± 1.1×10^?       NA              NA
5273  misId       p(j→b) 25<p_T                       0.0168 ± 0.00027          0.0183 ± 0.00018            5.4             8.7
5285  misId       p(q→τ)                              0.0034 ± 0.00012          0.0052 ± 8×10^?            14.9            52.5
5292  misId       p(q→γ) central                      0.0002651 ± 1.5×10^?      0.0002611 ± 1.2×10^?       -0.3            -1.5
5293  misId       p(q→γ) plug                         0.00159 ± 0.00013         0.000478 ± 4×10^?          -8.6           -70.0
5402  trigger     p(e→trig) plug, p_T>25              0.83 ± 0.015              0.86 ± 7×10^?               1.8             3.2
5403  trigger     p(μ→trig) CMUP+CMX, p_T>25          0.917 ± 0.007             0.918 ± 0.004               0.2             0.1

(10^? marks an exponent that is illegible in the source.)

Table B.13: Comparison of correction factors that were used also in the first
0.927 fb⁻¹. The Luminosity is in units of fb⁻¹.



B.5 Correction Factors 



B.5.1 Comparison with first round 

The correction factor values obtained in the second round (v02), corresponding to
2 fb⁻¹, are compared here with the correction factor values obtained in the first round
(v01), corresponding to 927 pb⁻¹. The numerical values can be found in Table B.13;
analysis of the changes is provided below.
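
The table does not spell out how its last two columns are computed. The sketch below assumes that the deviation is the shift between rounds in units of the first-round uncertainty, and that the change is the relative shift in percent; this assumption reproduces several rows to within the rounding of the quoted values. The function name is hypothetical.

```python
def compare_rounds(before: float, before_err: float, after: float) -> tuple[float, float]:
    """Return (deviation in sigma, change in percent) between two fit rounds.

    Assumption: the deviation is measured in units of the first-round
    uncertainty, and the change is relative to the first-round value.
    """
    deviation = (after - before) / before_err
    change_percent = 100.0 * (after - before) / before
    return deviation, change_percent


# Correction factor 5257, p(q->e) plug: 0.0008761 +- 1.8e-5 before, 0.0007611 after.
deviation_5257, change_5257 = compare_rounds(0.0008761, 1.8e-5, 0.0007611)
```

For this factor the sketch gives a deviation of about -6.4 σ and a change of about -13.1%, matching the tabulated entries.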



5001. The integrated luminosity of the sample has of course increased with respect
to v01. The present integrated luminosity obtained from the fit is again consistent
with the luminosity obtained from the CLC measurement.
Figure B-32: Profiles of the χ² function at its minimum along each correction factor axis. This array of figures is used as a
debugging tool to validate the parabolic form of the minimum and the calculation of the error matrix. The top left pane shows
χ² as a function of integrated luminosity (correction factor code 0001), holding all remaining correction factors fixed; remaining
panes show profiles along each of the other correction factors. One tick on the horizontal axis of the i-th pane corresponds to
δs_i, the obtained error on the correction factor value. One tick on the vertical axis corresponds to one unit of χ².



Code   Pull Apart   Contributions
0042   10.3   ( e6pmiss = -3.2 , e6jjpmiss = 2 , e6jj = 1.3 , e6j = -1.1 )
0037   10.2   ( jtau2 = -1 , ph0tau = 0.9 , jtau1 = -0.9 , ph6tau = 0.8 )
0036   9.2    ( bj5 = 1.5 , b5j = -1.3 , be0 = 0.8 , b5jj = -0.5 )
0013   8.9    ( e6jjpmiss = 2.9 , e0jjpmiss = -1.1 , jjmu0pmiss = -0.7 , jjmu0 = -0.5 )
0031   8.8    ( bbj = -0.9 , bj5 = 0.7 , be0 = 0.5 , bbjj = -0.5 )
0034   8.5    ( e6ph6 = -2.6 , e6jj = 0.8 , e6ph0 = 0.8 , e6j = -0.7 )
0001   8.4    ( e0j = -0.5 , e6pmiss = -0.4 , e0pmiss = 0.4 , e6jjpmiss = 0.4 )
0033   8.2    ( e0j = -4.7 , e0jj = 1.3 , be0 = 1 , e0jjj = 0.4 )
0018   7.7    ( jj = 1.4 , e0j = -1.3 , e6j = -0.9 , bj = 0.6 )
0034   7.2    ( e6jj = 2.3 , e6j = -1.6 , be6 = -0.9 , bej = 0.4 )
0014   6.9    ( e0jjjpmiss = -1 , jjjmu0 = -1 , e6jjjpmiss = 1 , jjjmu0pmiss = -0.6 )
0020   6.2    ( jjj = -2.1 , e6jj = 1.3 , e0jj = 0.5 , bej = 0.3 )
0004   6      ( jph0 = 2.2 , e0j = -1.4 , bph0 = -0.6 , be0 = 0.5 )
0026   5.7    ( e6pmiss = -1.3 , e6e6 = 1.3 , e6jjpmiss = 1 , e6e6j = -0.6 )
0012   5.3    ( e6jjpmiss = 0.7 , jmu0pmiss = 0.6 , be0pmiss = -0.6 , e0jpmiss = 0.5 )
0016   5.1    ( e6e6j = -1.5 , jmu0mu0 = 0.5 , constraints = 0.4 , e0jj = 0.27 )
0030   5      ( constraints = 1.4 , e6ph6 = -1 , mu0ph6pmiss = -0.4 , ph6tau = 0.27 )
0017   4.9    ( e6e6jj = 0.5 , jjjmu0 = -0.4 , e6e6jjj = -0.27 , e6e6j = -0.24 )
0029   4.8    ( jph0 = 0.8 , constraints = 0.5 , jjph0 = -0.5 , bph0 = -0.4 )
0005   4.5    ( jjph0 = -2.3 , e0jj = 0.7 , bjph0 = -0.3 , e6jj = 0.24 )
0040   4      ( ph6tau = 1.3 , e6ph6 = -0.9 , ph0ph6 = -0.3 , j5ph6 = -0.28 )
0038   4      ( bmu0 = 0.9 , jjmu0 = -0.8 , jjjmu0 = -0.7 , bjjmu = -0.5 )
0039   3.7    ( jph0 = 1.5 , jjph0 = -0.5 , bph0 = -0.4 , e6ph0 = 0.22 )
0025   3.5    ( e0pmiss = 0.9 , e0e0 = -0.8 , e0j = -0.2 , e0jjpmiss = -0.19 )
0015   3.3    ( e0e0 = -0.7 , e6e6 = 0.6 , e0e6 = -0.4 , constraints = 0.3 )
0006   3.1    ( jjjph0 = -1.9 , e0jjj = 0.6 , bjjph0 = 0.16 )
0007   3.1    ( jjjjph0 = -2.1 , e0jjjj = 0.6 , e6jjjj = -0.13 )
0022   3      ( bjjj = 0.6 , e6jjj = -0.4 , e0jjj = 0.3 , jjjmu0 = -0.3 )
0019   2.9    ( bj5 = 1.3 , b5j = -1 , bb5 = -0.14 , jj5 = -0.13 )
0035   2.9    ( jmu0 = 1.1 , jjmu0 = -0.6 , jjjmu0 = -0.5 , bmu0 = 0.4 )
0010   2.6    ( jjjphph = -0.9 , jjphph = 0.4 , e0jjph6 = 0.23 , e6jjjph6 = 0.17 )
0021   2.2    ( jjj5 = 1.2 , b5jj = -0.6 , jjj5ph0 = -0.12 )
0024   2.1    ( e6jjjj = -0.5 , e0jjjj = 0.5 , jjjjj = -0.4 , bjjjj = 0.24 )
0011   2      ( e0pmiss = 0.6 , e6pmiss = -0.6 , mu0pmiss = -0.25 , constraints = -0.19 )
0027   2      ( mu0pmiss = -0.5 , jmu0pmiss = 0.16 , jjmu0pmiss = -0.13 , jjjmu0 = -0.12 )
0043   1.8    ( mu0pmiss = -0.7 , constraints = 0.17 , jmu0pmiss = 0.16 , jjjmu0 = -0.16 )
0025   1.7    ( b5jj = -0.7 , bjj5 = 0.31 , bb5j = 0.29 , jjj5 = 0.15 )
0008   1.6    ( ph0ph6 = -0.5 , e6ph0 = 0.3 , constraints = 0.24 , ph0ph0 = -0.22 )
0026   1.2    ( bbjj5 = -0.5 , b5jjj = 0.29 , bb5jj = 0.25 )
0023   1.1    ( jjjj5 = -0.6 , b5jjj = 0.3 )
0002   0.7    ( j5ph0 = -0.11 )
0009   0.6    ( e6jph0 = 0.19 , constraints = 0.16 , e0jph0 = -0.11 )


Table B.14: Correction factor pull apart table, intended to show which correction
factors are being pulled in different directions. Letting χ²_k denote the k-th term in the
χ² sum, and s_i the i-th correction factor, the pull of the k-th bin on the i-th correction
factor is denoted pull_{k,i}. Intuitively, bin k "pulls" on the i-th correction factor with a
strength of pull_{k,i}. More precisely, the value obtained by the i-th correction factor is
pull_{k,i} standard deviations away from where it would be in the absence of the k-th bin.
If bin k pulls the i-th correction factor toward larger values, pull_{k,i} is positive; if bin k
favors smaller values of the i-th correction factor, pull_{k,i} is negative. The units of pull_{k,i}
are units of χ². The correction factors are sorted in order of decreasing pull apart,
where the pull apart of the i-th correction factor is defined as pullApart_i = Σ_k |pull_{k,i}|,
provided in the second column. Intuitively, a correction factor has large pull apart
if some bins strongly favor a larger value, and some bins strongly favor a smaller
value. In the third column between parentheses are the bins k that contribute most
to the pull apart of each correction factor, along with each individual contribution
pull_{k,i}. In each line, only the four largest contributions with |pull| > 0.1 are listed. In
the bin labels, a number following a particle specifies its centrality; a 4 following a particle
indicates it has p_T > 200 GeV; a 5 following mu indicates it is a CMX muon in the
region 0.6 < |η| < 1.0; a 10 following a particle indicates it lies in the plug region
1 < |η| < 2.5; constraints specifies the contribution from χ²_constraints.
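
A small Python sketch of the bookkeeping behind the pull apart follows. The pull apart is the sum of absolute pulls over bins for a fixed correction factor. The companion per-bin sum over correction factors is an assumption about how a bin's total influence would be formed, and the data layout and function name are hypothetical.

```python
from collections import defaultdict


def summarize_pulls(pulls: dict[tuple[str, str], float]) -> tuple[dict[str, float], dict[str, float]]:
    """Summarize signed pulls keyed by (bin_label, correction_factor_code).

    Returns two dictionaries:
      pull_apart[code] -- sum over bins of |pull| for that correction factor
      influence[bin]   -- sum over correction factors of |pull| for that bin
                          (assumed definition of a bin's total influence)
    """
    pull_apart: dict[str, float] = defaultdict(float)
    influence: dict[str, float] = defaultdict(float)
    for (bin_label, code), value in pulls.items():
        pull_apart[code] += abs(value)
        influence[bin_label] += abs(value)
    return dict(pull_apart), dict(influence)


# Listed contributions for correction factors 0006 and 0007 (partial: pulls
# below the 0.1 listing threshold are not available from the table).
listed = {
    ("jjjph0", "0006"): -1.9,
    ("e0jjj", "0006"): 0.6,
    ("bjjph0", "0006"): 0.16,
    ("jjjjph0", "0007"): -2.1,
    ("e0jjjj", "0007"): 0.6,
    ("e6jjjj", "0007"): -0.13,
}
pull_apart, influence = summarize_pulls(listed)
```

Because only the largest contributions are listed, the partial sums obtained this way (2.66 and 2.83) fall below the tabulated pull apart of 3.1 for both factors; the remainder comes from bins whose pulls are not shown.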
Bin           Total Influence   Individual Influence
e0j           8.9   ( 0033 = -4.7 , 0004 = -1.4 , 0018 = -1.3 , 0001 = -0.5 )
e6jjpmiss     8.1   ( 0013 = 2.9 , 0042 = 2 , 0026 = 1 , 0012 = 0.7 )
e6jj          6.6   ( 0034 = 2.3 , 0042 = 1.3 , 0020 = 1.3 , 0034 = 0.8 )
e6pmiss       6.1   ( 0042 = -3.2 , 0026 = -1.3 , 0011 = -0.6 , 0034 = -0.5 )
e6ph6         5.5   ( 0034 = -2.6 , 0030 = -1 , 0040 = -0.9 , 0034 = -0.21 )
jph0          5.2   ( 0004 = 2.2 , 0039 = 1.5 , 0029 = 0.8 , 0018 = 0.4 )
e6j           5     ( 0034 = -1.6 , 0042 = -1.1 , 0018 = -0.9 , 0034 = -0.7 )
constraints   4.7   ( 0030 = 1.4 , 0029 = 0.5 , 0016 = 0.4 , 0015 = 0.3 )
bj5           3.9   ( 0036 = 1.5 , 0019 = 1.3 , 0031 = 0.7 , 0001 = 0.28 )
be0           3.8   ( 0033 = 1 , 0036 = 0.8 , 0031 = 0.5 , 0004 = 0.5 )
jjph0         3.7   ( 0005 = -2.3 , 0039 = -0.5 , 0029 = -0.5 , 0020 = -0.2 )
jjjmu0        3.7   ( 0014 = -1 , 0038 = -0.7 , 0035 = -0.5 , 0017 = -0.4 )
e0jj          3.6   ( 0033 = 1.3 , 0005 = 0.7 , 0020 = 0.5 , 0016 = 0.27 )
be6           3.3   ( 0034 = -0.9 , 0042 = -0.6 , 0018 = -0.5 , 0036 = -0.4 )
jjmu0         2.9   ( 0038 = -0.8 , 0035 = -0.6 , 0013 = -0.5 , 0020 = -0.2 )
b5j           2.8   ( 0036 = -1.3 , 0019 = -1 , 0031 = -0.3 , 0001 = -0.2 )
ph6tau        2.8   ( 0040 = 1.3 , 0037 = 0.8 , 0030 = 0.27 , 0042 = 0.2 )
e6e6j         2.5   ( 0016 = -1.5 , 0026 = -0.6 , 0017 = -0.24 , 0001 = -0.1 )
jjjph0        2.5   ( 0006 = -1.9 , 0029 = -0.2 , 0039 = -0.17 , 0022 = -0.12 )
bph0          2.4   ( 0004 = -0.6 , 0036 = -0.5 , 0039 = -0.4 , 0029 = -0.4 )
jjjjph0       2.4   ( 0007 = -2.1 )
jjj           2.4   ( 0020 = -2.1 , 0001 = -0.24 )
jmu0          2.3   ( 0035 = 1.1 , 0038 = 0.5 , 0012 = 0.2 , 0018 = 0.17 )
e6jjjpmiss    2.3   ( 0014 = 1 , 0042 = 0.4 , 0013 = 0.4 , 0026 = 0.21 )
bej           2.3   ( 0034 = 0.4 , 0020 = 0.3 , 0031 = 0.32 , 0036 = 0.3 )
e0jjjj        2.2   ( 0007 = 0.6 , 0024 = 0.5 , 0014 = 0.29 , 0033 = 0.28 )
bmu0          2.2   ( 0038 = 0.9 , 0035 = 0.4 , 0036 = 0.23 , 0031 = 0.16 )
b5jj          2.2   ( 0025 = -0.7 , 0021 = -0.6 , 0036 = -0.5 , 0031 = -0.21 )
e6e6          2.1   ( 0026 = 1.3 , 0015 = 0.6 , 0001 = 0.18 )

e6ph0 


2 


( 0034 




0.8 


0008 


= 0.3 , 0039 = 


0.22 , 0029 = 0.21 ) 




2 


( 0006 




0.6 


0033 


= 0.4 , 0022 = 


0.3 , 0014 = 0.21 ) 


eOpmiss 


2 


( 0025 




0.9 


0011 


= 0.6 , 0001 = 


0.4 ) 


eOjjpmiss 


1.9 


( 0013 




-1.1 


, 0012 


= -0.25 , 0025 


= -0.19 , 0014 = -0.12 ) 


beOpmiss 


1.8 


( 0012 




-0.6 


, 0036 


= -0.5 , 0013 


= -0.25 , 0031 = -0.19 ) 


cOeO 


1.8 


( 0025 




-0.8 


, 0015 


= -0.7 , 0001 


= -0.21 ) 


cOjjjpmiss 


1.7 


( 0014 




-1 , 


0013 = 


-0.4 , 0025 = 


-0.12 ) 


pliOtau 


1.7 


( 0037 




0.9 


0004 


= 0.31 , 0029 = 


= 0.2 , 0039 = 0.15 ) 


muOpmiss 


1.7 


( 0043 




-0.7 


, 0027 


= -0.5 , 0011 


= -0.25 , 0001 = -0.2 ) 


jjmuOpmiss 


1.6 


( 0013 




-0.7 


, 0017 


= -0.2 , 0012 - 


= -0.17 , 0043 = -0.14 ) 


bj 


1.6 


( 0018 




0.6 


0031 


= 0.5 , 0036 = 


0.4 ) 



Table B.15: Correction factor influence table. Letting $\chi^2_k$ denote the $k$th term in the
sum and $s_i$ the $i$th correction factor, the pull of the $k$th bin on the $i$th correction
factor is denoted $\mathrm{pull}_{ki}$. The total influence of a bin $k$ is defined as $\mathrm{totalInfluence}_k =
\sum_i |\mathrm{pull}_{ki}|$. Intuitively, bins with large total influence are "important" in influencing
the position of the minimum. Bins with large total influence tend to be big
(containing many data events), pull on many correction factors, and prefer correction
factor values significantly different from the values they would otherwise assume.
Bins in this table are sorted in order of decreasing total influence, provided in the
second column. In the third column between parentheses are the correction factors $s_i$
that are most influenced by the bin. The extent to which these correction factors are
influenced is also shown in the third column, with an entry such as 0001 = -0.65
indicating correction factor code 0001 feels a pull of -0.65. In each line, only the
four largest contributions with $|\mathrm{pull}| > 0.1$ are listed. Due to the multiplicative nature
of the correction factors, the pull on each correction factor from bin $k$ is typically
negative if the Standard Model prediction exceeds the number of data events in bin
$k$, and positive if the Standard Model prediction falls short of the data in bin $k$.
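
The pull apart of a correction factor and the total influence of a bin are the column sums and row sums of the same matrix of absolute pulls. The following is a minimal sketch of that bookkeeping, assuming a small made-up matrix `pull[k][i]`; the numbers and the function names are illustrative and are not taken from the fit.

```python
def pull_apart(pull):
    """pullApart_i: sum over bins k of |pull[k][i]|, one value per correction factor i."""
    n_factors = len(pull[0])
    return [sum(abs(row[i]) for row in pull) for i in range(n_factors)]


def total_influence(pull):
    """totalInfluence_k: sum over correction factors i of |pull[k][i]|, one value per bin k."""
    return [sum(abs(p) for p in row) for row in pull]


# Hypothetical pulls: rows are bins k, columns are correction factors i.
pull = [
    [-4.7, -1.4, 0.0],
    [2.9, 0.0, 2.0],
    [0.0, 1.3, -0.6],
]

apart = pull_apart(pull)
influence = total_influence(pull)

# Order the bins by decreasing total influence, as the influence table does.
ranking = sorted(range(len(pull)), key=lambda k: influence[k], reverse=True)
print(apart, influence, ranking)
```

Because the tables list at most four contributions per line, the sum of the listed entries can be smaller than the quoted total.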
[Table B.16 body: the matrix entries are not recoverable from the extracted text. Rows and columns are labeled, in order, by the correction factor codes 5001, 5102, 5103, 5121, 5122, 5123, 5124, 5130, 5131, 5132, 5141, 5142, 5143, 5144, 5151, 5152, 5153, 5161, 5162, 5164, 5165, 5167, 5168, 5169, 5170, 5171, 5211, 5212, 5213, 5216, 5217, 5219, 5246, 5256, 5257, 5261, 5266, 5273, 5285, 5292, 5293, 5402, 5403.]

Table B.16: Correction factor correlation matrix. The topmost row and leftmost column show correction factor codes. Each
element of the matrix shows the correlation between the correction factor labeling the element's column and the correction 
factor labeling the element's row. Each matrix element is dimensionless; the elements along the diagonal are unity; the matrix 
is symmetric; positive elements indicate positive correlation, and negative elements anti-correlation. 
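
The properties listed in the caption (unit diagonal, symmetry, dimensionless entries) follow directly if the correlation matrix is obtained by normalizing a covariance matrix. The sketch below illustrates this under that assumption; the caption does not state how the matrix was computed, and the 3x3 covariance matrix is invented for the example.

```python
import math


def correlation_from_covariance(cov):
    """Normalize a covariance matrix: corr[i][j] = cov[i][j] / (sigma_i * sigma_j)."""
    n = len(cov)
    sigma = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sigma[i] * sigma[j]) for j in range(n)] for i in range(n)]


# Hypothetical covariance matrix of three correction factors.
cov = [
    [4.0, -3.0, 0.5],
    [-3.0, 9.0, 0.0],
    [0.5, 0.0, 1.0],
]

corr = correlation_from_covariance(cov)
for row in corr:
    print(["%+.2f" % c for c in row])
```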
5102. This cosmic photon "k-factor" has increased because this background is now
required to satisfy the same good run list that is required for the data, and because
these events are required to contain at least one reconstructed photon. As a result the number
of events in this background has decreased, prompting this k-factor to increase
accordingly.

5103. This cosmic jet "k-factor" has decreased due to the cut on the second jet
in jet final states, as described in Sec. 4.2.3. The cut removes events in which the
leading jet is due to a cosmic ray, and the other jets are due to the underlying event.
As a result of this removal, the k-factor for this background has been reduced.

5121-5132. The k-factors for photon + jet production and diphoton production
are consistent with the values obtained in v01.

5151-5153. The k-factors for Z + jet production are consistent with the values obtained
in v01.

5141-5144. Motivated by a mistake in the modelling of the inoperational period
of the keystone and miniskirt portions of the muon detector, we switched from the
MadEvent W+jets Monte Carlo sample to the standard Top Group Alpgen W+jets
sample. These k-factors were changed to correspond to Alpgen cross sections.

5161-5169. In v01 of this analysis we used $p(j \to j) = 1$, despite the fact that
$p(j \to b) > 0.01$. It is logically more consistent to choose $p(j \to j) = 1 - p(j \to b)$,
so this is what is done in v02. The result of this modification is that k-factors for
processes with one or more jets have increased.
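
The size of the increase can be estimated with a simplified picture. Assuming each jet is identified independently, the predicted yield of a final state with $n$ untagged jets scales as $p(j \to j)^n$, so the k-factor has to grow by about $1/(1 - p(j \to b))^n$ to keep the prediction unchanged. The sketch below takes $p(j \to b) = 0.01$ as an illustrative value; the function name is hypothetical, and the actual k-factors come from the global fit, not from this formula.

```python
def kfactor_rescale(p_j_to_b, n_jets):
    """Factor by which a k-factor must grow to keep the predicted yield of an
    n-jet final state fixed when p(j->j) changes from 1 to 1 - p(j->b)."""
    p_j_to_j = 1.0 - p_j_to_b
    return 1.0 / p_j_to_j ** n_jets


# Illustrative value only; the text quotes p(j->b) > 0.01.
for n in range(1, 5):
    print(n, round(kfactor_rescale(0.01, n), 4))
```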

5170, 5171. These two k-factors for heavy flavor multijet production have been
introduced.
5211, 5212. The central electron identification efficiency is consistent with the value
obtained in v01. The phoenix electron identification efficiency scale factor has changed,
reflecting our change to the phoenix electron identification criteria.

5213. The muon identification efficiency scale factor has changed due to our change
to the muon identification criteria, and the correction to the modelling of the
inoperational period of the keystone and miniskirt portions of the muon detector.

5216, 5217, 5219. The identification efficiencies $p(\gamma \to \gamma)$ in the central and plug
regions, and $p(b \to b)$ in the central region, are consistent with the values obtained in v01.

5245. The fake rate $p(e \to \gamma)$ has been removed after the change to the plug electron
and photon identification. It was found to be unnecessary. This vanished correction
factor is not listed in Table B.13.

5246. The fake rate $p(\gamma \to e)$ in the plug has been promoted to a correction factor
from a fixed value of 0.005. This value increased significantly due to a redefinition
of plug photons into electrons in the 1e+1ph final state. This was motivated by the
fact that this plug photon was much more likely to have been an electron. We have
removed this renaming procedure for the current version of the analysis.

5256, 5257. The fake rates $p(q \to e)$ in the central and plug regions have decreased
by roughly 13% and 6%, respectively, due to our improved conversion removal. In
v01 we required a candidate conversion track to have $p_T > 2$ GeV; in v02 we make no
transverse momentum requirement on the candidate conversion track. The change to
the fake rate in the plug region is also affected by our change to the phoenix electron
identification.

5261. The fake rate $p(q \to \mu)$ is consistent with the value obtained in v01.

5273. The fake rate $p(j \to b)$ is consistent with the value obtained in v01.
5285. A different $p_T$ dependence has been imposed for the fake rate $p(q \to \tau)$ in v02
than was applied in v01, and a dependence on the generated sumPt has also been applied.
Because the normalizations of those functions were not treated carefully, this
number is not directly comparable to the one from v01.

5292. The value obtained for the fake rate $p(q \to \gamma)$ in the central region is
consistent with the value obtained in v01.

5293. The fake rate $p(q \to \gamma)$ in the plug has decreased due to our correction to
the plug photon identification criteria.

5401. The central electron trigger efficiency has been found to increase to unity in
the current version of the analysis, because we now allow an event to pass on any
online trigger. As a consequence, it is no longer appropriate to constrain this trigger
efficiency to the Joint Physics value for the CEM trigger. We now simply fix the
central electron trigger efficiency to 1.0 and it is no longer a correction factor. This
vanished correction factor is not listed in Table B.13.

5402. The plug electron trigger efficiency is consistent with the value from v01.

5403. We have combined the CMUP and CMX trigger efficiencies, because
they were very close to each other in v01 of the analysis. The value in v02 of
the analysis is consistent with the values from v01.
Appendix C

Risk of Being Ad Hoc 



C.1 Introduction

Here follows a general discussion, not so much about the actual SM implementation 
in this analysis, but about concerns such as bias, not being "blind", and how these 
factors affect the meaning of a null result. 

In a search for new physics, especially a model-independent one, it is necessary to 
construct the Standard Model (SM) prediction. Then, one can test whether the data 
(D) are consistent with it. 

By definition, the data follow the true law of nature. Denote the true theory by
$T$. If there is physics beyond the SM, then $T \neq \mathrm{SM}$. If new physics is to be observed,
the p.d.f. of at least one observable quantity needs to differ sufficiently from that
predicted by the SM.

Having the data events distributed according to $T$, one has the freedom to test
their consistency with any conceivable theory. However, what is really interesting
is how well the data agree with the SM, rather than with some arbitrary model that is not
necessarily well motivated. We could, for example, construct a model agreeing bin by
bin with the data. Imagine for instance having a dedicated k-factor¹ per final state;

¹k-factors are corrections to the cross sections of processes. Typically, cross sections are calculated
to leading order, or next-to-leading order, and rarely to an even higher order. k-factors are meant
to correct such approximate calculations to the infinite-order cross section, which is incalculable;
therefore k-factors are inferred from the data.
Figure C-1: Simplified picture of the p.d.f.s of the true theory and several possibilities
for the SM implementation.

then we would be able to adjust this elastic pseudo-theory to match any combination 
of populations across final states. That data-obeying model would be consistent 
with T. Then, by construction, testing the quality of the fit would confirm the null 
hypothesis, namely that data agree with the constructed model. The hypothesis test 
itself would be perfectly legitimate, and its outcome would be correct, yet completely 
uninteresting, since nobody is interested in that absurd model anyway. 
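
The elasticity of such a pseudo-theory can be made concrete with a toy calculation. The sketch below uses invented populations for three final states and a simple Pearson $\chi^2$ (both are assumptions for illustration, not the statistic or the numbers of this analysis). With one dedicated k-factor per final state the fit is perfect by construction, whereas a single shared k-factor leaves a non-zero $\chi^2$ that can actually test something.

```python
def pearson_chi2(data, prediction):
    """Pearson chi-square between observed and predicted populations."""
    return sum((d - p) ** 2 / p for d, p in zip(data, prediction))


data = [120, 45, 9]        # hypothetical observed populations per final state
raw = [100.0, 50.0, 6.0]   # hypothetical uncorrected predictions

# One dedicated k-factor per final state: k_i = data_i / raw_i.
k_per_state = [d / r for d, r in zip(data, raw)]
elastic = [k * r for k, r in zip(k_per_state, raw)]

# One k-factor shared by all final states, fixed by the total population.
k_shared = sum(data) / sum(raw)
shared = [k_shared * r for r in raw]

chi2_elastic = pearson_chi2(data, elastic)
chi2_shared = pearson_chi2(data, shared)
print(chi2_elastic, chi2_shared)
```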

The problem then begins with the realization that the truly interesting hypothesis,
the SM, is itself not known exactly; one needs information about correction factors,
such as fake rates, k-factors, efficiencies etc. Different values of such parameters result
in different "SM" predictions.

Let's assume there are only two observable quantities, $A_1$ and $A_2$ (Fig. C-1). For
example, $A_1$ and $A_2$ could be the populations of events in two final states. Depending
on the values of some correction factors (like k-factors etc.), the prediction of the SM
implementation can be centered anywhere in some locus. In this case, the allowed
locus is represented by a one-dimensional solid line; in general, the locus may be
higher-dimensional.

The correction factors have some true values, which may be unknown. The true 
Standard Model prediction is located at the "SM" point, which corresponds to the 
true values of the correction factors. Ideally, that is the SM we would like to compare 
to the data. 

When the work to construct the SM prediction begins, no adjustments have been
made yet, which results in some prediction centered on, say, point $\mathrm{SM}_0$. One then sees
the data², which are by definition near point $T$, and notices the discrepancies in
$A_1$ and $A_2$. Since he has applied no corrections yet, he cannot be confident that the
current prediction is the real SM. The SM has been successful so far, therefore to rule
it out one needs convincing evidence. To be convincing, he needs to be conservative;
he must exploit any source of systematic uncertainty that he can identify in order to
correct the prediction in a direction that brings it closer to the data. Unfortunately,
there is no prescription for how to do that correctly.

There are some obvious sources of uncertainty: k-factors reflecting the fact that
it is not possible to calculate the infinite-order cross section of SM processes,
uncertainties in the exact probability by which a particle may be misidentified, uncertainty
in the integrated luminosity etc. For specific discrepancies that are not accounted
for by such obvious uncertainties, one needs to become more imaginative to identify
what may be causing them, but it is important not to invent false corrections. It
requires judgment to make well motivated adjustments instead of ad hoc corrections
that hide the signal of potentially new physics. The locus, represented by the solid
line in Fig. C-1, is meant to represent the possible predictions that can be derived
by making well motivated corrections, whereas points out of the locus represent the
results of poorly motivated corrections.

Suppose that throughout the process one makes well motivated corrections. Then
his prediction should drift along the locus from point $\mathrm{SM}_0$ to point $M_a$, which gives
the best agreement with the observed data in $A_1$ and $A_2$ simultaneously. Even though
$M_a \neq \mathrm{SM}$, he will need to stop at $M_a$ and not proceed towards the actual SM point.

²Whether he sees all, or part, or only some aspect of them will be discussed later.
That is because he has no way to know whether he has reached the actual SM and should stop there;
his only guidance is the data and his judgment. To be conservative, he would have
to bring the prediction as close to point $T$ as allowed, but not closer; that is point
$M_a$. The wrong thing to do would be to introduce extraneous, poorly motivated
corrections that would drive one from $\mathrm{SM}_0$ to a prediction like $M_c$, namely out of
the locus. That would be the result of ad hoc treatment of discrepancies, which in
its extreme limit would result in a model as uninteresting as the data-obeying model
mentioned earlier.

What can safeguard one from constructing the prediction of some poorly motivated
model? Only prudence and an over-constrained system that limits systematic
uncertainties, making it harder to deviate from the SM locus. The risk of implementing
an ad hoc model remains unless all systematic uncertainties shrink to zero, in
which ideal case the locus would shrink into just the true SM point. However, there
are some "blind" approaches that, as will be argued, create the illusion of safety
against erring, or the sensation that information is generated out of nothing, by using
the data in "clever" ways, i.e. by not seeing all of them at the same time.

C.2 Blind to signal region 

In some cases (not in this analysis) one may presume that the new physics will be
affecting $A_2$ but not $A_1$. That is clearly an assumption, which in many cases can
be motivated. $A_2$ is then treated as "signal region", and $A_1$ as "control region".
Adjusting the correction model to achieve maximal agreement with the data in $A_1$
is legitimate, since the premise is that the SM should distribute $A_1$ as $T$ does. That
leads (if everything is done correctly) to a SM implementation with p.d.f. centered
on $M_b$.

There is nothing wrong in defining control and signal regions. Clearly, when
interpreting the result of the comparison of the data with $M_b$, one needs to remember
that $M_b$ is not the globally best fitting model (that would be $M_a$). Furthermore,
$M_b$ is not necessarily the true SM, but is the model that best fits the control region.
Indicative of the value of such results is the fact that what is "signal" region in one analysis
can be "control" in another. Depending on what one defines as "signal" and "control",
the result may vary from agreement to disagreement with the data. Although these
results can be valid, they are convincing only if the initial premise is accepted.
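
The difference between the control-region fit and the global fit can be illustrated with a toy one-dimensional locus. The sketch below assumes a single multiplicative correction factor $s$ that scales both predicted populations, invented data and predictions, and a least-squares fit with the data used as the variance; none of this is taken from the analysis, and the function name is hypothetical.

```python
def best_scale(data, shape):
    """Scale s minimizing sum_i (d_i - s*p_i)^2 / d_i.
    Setting the derivative to zero gives s = sum_i p_i / sum_i (p_i^2 / d_i)."""
    numerator = sum(shape)
    denominator = sum(p * p / d for d, p in zip(data, shape))
    return numerator / denominator


data = [110.0, 70.0]    # hypothetical observed (A1, A2)
shape = [100.0, 50.0]   # hypothetical uncorrected prediction for (A1, A2)

s_control = best_scale(data[:1], shape[:1])  # fit to the control region A1 only
s_global = best_scale(data, shape)           # fit to A1 and A2 simultaneously

a2_control = s_control * shape[1]  # A2 predicted by the control-region model
a2_global = s_global * shape[1]    # A2 predicted by the globally best model
print(s_control, s_global, a2_control, a2_global)
```

In this toy the control-region model predicts a smaller $A_2$ than the global fit does, so it shows a larger discrepancy in the signal region; whether that discrepancy is meaningful depends entirely on the premise that new physics does not affect $A_1$.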

Unfortunately, staying "blind" in $A_2$ does not guarantee that the final model will
not be an absurd and ad hoc one. For example, a human error may lead one from
$\mathrm{SM}_0$ to $M_d$ or $M_e$. Apart from a human error that may occur during the development
of the correction model, opening the box (e.g. looking at the measured $A_2$) often
makes people question the correctness of their implemented model, especially in the
event of a discrepancy with the data. In that phase of reconsideration, one may even
accidentally change his background model from $M_b$ to $M_e$, so the notion of "blindness"
is questionable, unless no discrepancy is seen. Therefore, as in the non-blind analysis
case, only prudence and an over-constrained system that limits systematic uncertainties,
making it harder to deviate from the SM locus, can prevent testing the goodness of
a worthless model (like $M_c$, $M_e$ or $M_d$).

C.3 Blind to part of the data 

Another approach considered "blind" is to split the whole data set ($D$) in two parts
($D_{\mathrm{control}}$, $D_{\mathrm{signal}}$), assigning for example every third event to $D_{\mathrm{control}}$ and the rest to
$D_{\mathrm{signal}}$. Then, $D_{\mathrm{control}}$ can be used to develop the correction model, and $D_{\mathrm{signal}}$ is only
revealed in the end, to check how well it is fitted by the derived background model.

The supposed advantage of this approach is that $D_{\mathrm{signal}}$ is independent from
$D_{\mathrm{control}}$. So, if agreement is observed between $D_{\mathrm{signal}}$ and the background model,
that supposedly cannot be due to a biased model, as the background model was
developed knowing nothing about $D_{\mathrm{signal}}$. Though psychologically reassuring, this
impression of safety is false.

Obviously, all data come from the same distribution $T$, therefore there is no reason
why $D_{\mathrm{signal}}$ would be distributed any differently than $D_{\mathrm{control}}$, apart from random
statistical fluctuations, which actually become relatively bigger when $D_{\mathrm{control}}$ and $D_{\mathrm{signal}}$ have
smaller populations.

If one makes wrong judgments in the way he uses $D_{\mathrm{control}}$, then there are two
possibilities. If one observes agreement between the background model and $D_{\mathrm{signal}}$,
that only means that $D_{\mathrm{signal}}$ didn't fluctuate too differently from $D_{\mathrm{control}}$. On the
other hand, if one observes disagreement, that would only be due to a (rare) statistical
fluctuation of $D_{\mathrm{signal}}$ with respect to $D_{\mathrm{control}}$. In other words, if one makes the wrong
use of $D_{\mathrm{control}}$, the result is as uninformative as it would be if he had used the whole
$D$ in a wrong way.

Furthermore, even if one is very prudent and has an over-constrained system with
small systematic uncertainties, splitting the data still makes the situation worse.
Having less data in $D_{\mathrm{control}}$ to constrain the correction factors makes the locus where the
SM could be larger, therefore it is more likely to end up with a correction model farther
away from the actual SM, simply due to larger systematic uncertainties. In addition,
having a smaller number of data in $D_{\mathrm{signal}}$ reduces statistical power, making it harder
to observe a real effect that may appear in the measured $A_1$ and $A_2$.
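
The loss of precision from splitting can be quantified in a simple counting picture. The sketch below assumes Poisson statistics, for which the relative fluctuation of a population $N$ is $1/\sqrt{N}$, and a hypothetical final state with 900 events split as described above (every third event to $D_{\mathrm{control}}$); the numbers are illustrative only.

```python
import math


def relative_fluctuation(n_events):
    """Relative size of a Poisson fluctuation, sqrt(N)/N, for an expected count N."""
    return 1.0 / math.sqrt(n_events)


n_total = 900                   # hypothetical population of one final state
n_control = n_total // 3        # every third event goes to D_control
n_signal = n_total - n_control  # the rest goes to D_signal

for label, n in [("D", n_total), ("D_control", n_control), ("D_signal", n_signal)]:
    print(label, n, round(relative_fluctuation(n), 4))
```

Both subsets fluctuate relatively more than the full data set, so the correction factors are constrained less tightly by $D_{\mathrm{control}}$ and a real effect is harder to see in $D_{\mathrm{signal}}$.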

In summary, splitting $D$ in two does not protect one from implementing his
theoretical prediction wrongly. If one can make proper use of $D_{\mathrm{control}}$, then he can also make
proper use of the whole $D$, which would offer the advantage of smaller uncertainties.

C.4 Summary 

To summarize, there is no way to be sure that the null hypothesis compared to the
data is the SM, rather than some other, uninteresting one. However, there is reason to
hope that what was tested in this analysis is the agreement of the data with a model
that at least could be the SM, namely one that belongs to the SM locus determined by
well motivated systematic uncertainties. Certainly, the tested model is biased to agree
with the data more than the SM may actually agree³, since the best fitting choice
of correction parameters was made, but that is inevitable, since the SM is assumed
correct until proven otherwise. The hope that the implemented background
model is not far from the actual SM is based on the fact that the correction model
is significantly over-constrained by examining not just a couple of observables, but
thousands. Human errors are always possible, but the best effort was made to
eliminate them. Well motivated corrections usually fix several problems at once, while
mistaken adjustments tend to fix one problem but cause others. Our global approach
allowed us to distinguish the former from the latter, by simultaneously monitoring so
many observables before and after the adjustments.

³Think of the analogy given by points "SM" and $M_a$ in Fig. C-1.
Appendix D
Nomenclature



BMU Barrel Muon system. Often synonymous with IMU

CDF Collider Detector at Fermilab

CEI Charge Exchange Injection

CEM Central Electromagnetic calorimeter

CERN Conseil Européen pour la Recherche Nucléaire

CES Central Electromagnetic Showermax detector

CHA Central Hadronic calorimeter

CKM Cabibbo Kobayashi Maskawa

CLC Cerenkov Luminosity Counter

CMP Central Muon Upgrade

CMUP A muon that has both CMU and CMP hits

CMU Central Muon Detector

CMX Central Muon Extension

COT Central Outer Tracker

CPR Central Preshower detector

CPU Central Processor Unit

CP Charge Parity

CSL Consumer Server Logger

DAQ Data Acquisition

DIS Deep Inelastic Scattering

EM Electromagnetic

EVB Event Builder

EWK Electroweak

EWSB Electroweak Symmetry Breaking

FCC Feynman Computing Center

FIFO First In First Out

FNAL Fermi National Accelerator Laboratory

GMSB Gauge Mediated Supersymmetry Breaking

GUT Grand Unification Theory

ID Identification

IMU Intermediate Muon system

ISL Intermediate Silicon Layer

KS Kolmogorov Smirnov

L00 Layer 00 of the Silicon Detector

LHC Large Hadron Collider

LO leading order

MC Monte Carlo

MET Missing Transverse Energy

MIP Minimum Ionizing Particle

MI Main Injector

PDF Parton Distribution Function

p.d.f. Probability Density Function

PEM Plug Electromagnetic calorimeter

PES Plug Electromagnetic Showermax detector

PHA Plug Hadronic calorimeter

PHX "Phoenix", referring to forward tracks reconstructed from silicon hits

PMNS Pontecorvo Maki Nakagawa Sakata

PMT Photomultiplier

QCD Quantum Chromodynamics

RF Radio Frequency

SCPU Scanner CPU

SM Standard Model of elementary particles 

SUGRA Supergravity 

SUSY Supersymmetry 

SVT Silicon Vertex system 

SVX Silicon Vertex Detector 

TSI Trigger Supervisor 

UV Ultraviolet 

VME Versa Module Europa (VMEbus), a standard
computer bus used for readout electronics

VRB VME Readout Buffer (or Board) 

WHA Endwall Hadronic calorimeter 

WLS Wavelength Shifting (optic fiber) 

XCES Extrapolation to Central Electromagnetic 
Showermax 

XFT Extremely Fast Tracker 

XTRP Extrapolation Unit 






Bibliography 



[1] John F. Donoghue, Eugene Golowich, and Barry R. Holstein. Dynamics of the 
Standard Model, chapter I, II, III. Cambridge University Press, 1994. 

[2] Esteban Roulet. Beyond the Standard Model. 2001, hep-ph/0112348. 

[3] Review of Particle Physics. Physics Letters B, 592:1+, 2004. 

[4] Keith A. Olive. Dark matter. 2003, astro-ph/0301505. 

[5] Chris Quigg. Spontaneous symmetry breaking as a basis of particle mass. Reports 
on Progress m Physics, 70 (7): 10 19-1053, 2007. 

[6] Jogesh C. Pati and Abdus Salam. Unified Lepton-Hadron Symmetry and a 
Gauge Theory of the Basic Interactions. Phys. Rev., D8:1240-1251, 1973. 

[7] H. Georgi and S. L. Glashow. Unity of All Elementary Particle Forces. Phys. 
Rev. Lett, 32:438-441, 1974. 

[8] Stephen P. Martin. A supersymmetry primer. 1997, hep-ph/9709356. 

[9] Ted Thomas Derek Raine. An introduction to the science of Cosmology, chapter 
3.12.3. Institute of Physics Publishing, 2001. 

[10] Nima Arkani-Hamed, Savas Dimopoulos, and G. R. Dvali. The hierarchy problem 
and new dimensions at a millimeter. Phys. Lett., B429:263-272, 1998, hep- 
ph/9803315. 

[11] Thomas Appelquist, Hsin-Chia Cheng, and Bogdan A. Dobrescu. Bounds on 
universal extra dimensions. Phys. Rev., D64:035002, 2001, hep-ph/0012100. 

235 



[12] Lisa Randall and Raman Sundrum. A large mass hierarchy from a small extra 
dimension. Phys. Rev. Lett, 83:3370-3373, 1999, hep-ph/9905221. 

[13] R. Sekhar Chivukula. Technicolor and compositeness. 2000, hep-ph/0011264. 

[14] F. Abe et al. Observation of top quark production in pp collisions. Phys. Rev. 
Lett, 74:2626-2631, 1995, hep-ex/9503002. 

[15] Fermilab's chain of accelerators: The proton source, 
http: / / www-bd.fnal.gov/public/proton.html. 

[16] Fermilab linac upgrade conceptual design revision 4a. FERMILAB-LU- 
CONCEPTUAL-DESIGN. 

[17] Booster rookie book. 

http:/ /www-bdnew. fnal.gov/operations/rookie_books/Booster _V3_l.pdf. 

[18] C. Hojvat et al. The multiturn charge exchange injection system for the fermilab 
booster accelerator, (talk). IEEE Trans. Nucl. Sci., 26:3149-3151, 1979. 

[19] Main injector performance goals. 1998. 
http://www-fmi.fnal.gov/Preform%20Goals/Chapter_5.pdf. 

[20] The antiproton source rookie book. 1999. 
http://www-bdnew.fnal.gov/pbar/documents/PBAR_Rookie_Book.pdf. 

[21] B. F. Bayanov et al. The proton beam lithium lens for the fermilab anti-proton 
source. FERMILAB-TM-1000. 

[22] Gerry Jackson. The fermilab recycler ring technical design report, rev. 1.2. 
FERMILAB-TM-1991. 

[23] Dave McGinnis. Run ii handbook. 2001. 
http://www-bd.fnal.gov/runII/index.html. 

[24] Krzysztof Genser and Paul Lebrun. Tevatron Transverse Beam Emittance and 
Luminosity at CDF and D0 for stores 3179-3293. 2004. Beams-doc-1075. 

[25] Vaia Papadimitriou. Updated history of CDF/D0 luminosity ratio and some 
correlations. 2006. 
http://www-bd.fnal.gov/SDA_Viewer/VaiaLuminosity/vaia_lum_16Aug06_web.ppt. 

[26] Patrick T. Lukens. The CDF IIb detector: Technical design report. 
FERMILAB-TM-2198. 

[27] Bart Zeghbroeck. Principles of Semiconductor Devices. 2004. 
http://ece-www.colorado.edu/~bart/book/book/. 

[28] A. Boveia. Status and performance of the CDF Run II silicon detector. PoS, 
HEP2005:377, 2006. 

[29] Joe Incandela, Jeff Spalding, Paul Shepard, Tim Nelson, David Stuart, Maurice 
Garcia-Sciveres, Igor Volobouev, and Steve Worm. The CDF Run II Silicon 
Tracking System. Nucl. Instrum. Meth., A447:1-8, 2000. 

[30] Open-cell chamber to replace the CTC. CDF note 3648, 1996. 

[31] CDF Central Outer Tracker. CDF note 6227 and Nucl. Instrum. Meth., A526, 2004. 

[32] Hans Wenzel. Tracking In The SVX. CDF note 1790, 1992. 

[33] Paolo Gatti. Performance of the new tracking system at CDF II. CDF note 
5561, 2001. 

[34] W.M. Yao and K. Bloom. Outside-in Silicon Tracking at CDF. CDF note 5991, 
2002. 

[35] Yimei Huang, Chris Hays, and Ashutosh Kotwal. Inside-Out Tracking. CDF 
note 6707, 2003. 

[36] F. Abe et al. The CDF detector: an overview. Nucl. Instr. Meth., A271:387-403, 
1988. 

[37] D. Acosta et al. The cdf cherenkov luminosity monitor. Nucl. Instrum. Meth., 
A461:540-544, 2001. 

[38] Georgios Choudalakis, Conor Henderson, Khaldoun Makhoul, and Markus Klute. 
CDF Event Builder for the Shift Crew. CDF Note 7846, 2005. 

[39] Markus Klute. CDF RunIIb Event Builder: Data Integrity Checks performed 
by the SCPU. CDF note 8021, 2006. 

[40] R. Brun and F. Rademakers. Root: An object oriented data analysis framework. 
Nucl. Instrum. Meth., A389:81-86, 1997. 

[41] T. Aaltonen et al. Model-Independent and Quasi-Model-Independent Search for 
New Physics at CDF. 2007, arXiv:0712.1311 [hep-ex]. 

[42] CDF Collaboration. Model-Independent Global Search for New High-pT Physics 
at CDF. 2007, arXiv:0712.2534 [hep-ex]. 

[43] Georgios Choudalakis. Vista, results of a model-independent search for new 
physics in 927 pb^-1 at CDF. 2007, arXiv:0710.2372 [hep-ex]. 

[44] Georgios Choudalakis. Sleuth at CDF, a quasi-model-independent search for new 
electroweak scale physics. 2007, arXiv:0710.2378 [hep-ex]. 

[45] F. Krauss. Matrix elements and parton showers in hadronic interactions. JHEP, 
08:015, 2002, hep-ph/0205283. 

[46] CDF Collaboration. Measurements of inclusive W and Z cross sections in pp̄ 
collisions at √s = 1.96 TeV. J. Phys. G, 34:2457, 2007. 

[47] CDF Collaboration. 2007. Phys. Rev. D 75 092004. 

[48] CDF Collaboration. Measurement of the cross section for prompt diphoton 
production in pp̄ collisions at √s = 1.96 TeV. 2005. Phys. Rev. Lett. 95 022003. 

[49] Robert Craig Group. Measurement of the inclusive jet cross section using 
the midpoint algorithm in run ii at the collider detector at fermilab (cdf). 
FERMILAB-THESIS-2006-29. 



[50] A. Bhatti et al. Determination of the jet energy scale at the collider detector at 
fermilab. 2006. Nucl. Instrum. Meth. A566 375. 

[51] Christopher Neu. CDF b-tagging: Measuring efficiency and false positive rate. 
2006. Presented at TOP 2006: International Workshop on Top Quark Physics, 
Coimbra, Portugal. 

[52] A.V. Kotwal, H.K. Gerberich, and C. Hays. Identification of cosmic rays using 
drift chamber hit timing. Nucl. Instrum. Meth., A506:110, 2003. 

[53] Torbjorn Sjostrand, Stephen Mrenna, and Peter Skands. Pythia 6.4 physics and 
manual. JHEP, 05:026, 2006, hep-ph/0603175. 

[54] Fabio Maltoni and Tim Stelzer. Madevent: Automatic event generation with 
madgraph. JHEP, 02:027, 2003, hep-ph/0208156. 

[55] G. Corcella et al. Herwig 6.5 release note. 2002, hep-ph/0210213. 

[56] CTEQ Collaboration. Global QCD analysis of parton structure of the nucleon: 
Cteq5 parton distributions. 2000. Eur. Phys. J. C 12 375. 

[57] Rick Field. Cdf run 2 monte-carlo tunes. 2005. Proceedings of TeV4LHC 
Workshop, Fermilab, Batavia, IL. FERMILAB-CONF-06-408-E. 

[58] Stephen Mrenna and Peter Richardson. Matching matrix elements and parton 
showers with herwig and pythia. JHEP, 05:040, 2004, hep-ph/0312274. 

[59] Nikolaos Kidonakis and Ramona Vogt. Next-to-next-to-leading order soft-gluon 
corrections in top quark hadroproduction. Phys. Rev., D68:114014, 2003, 
hep-ph/0308222. 

[60] T. Stelzer and W. F. Long. Automatic generation of tree level helicity amplitudes. 
1994. Comput. Phys. Commun. 81 357. 

[61] E. Gerchtein and M. Paulini. Cdf detector simulation framework and 
performance. 2003, physics/0306031. 

[62] Guenter Grindhammer, M. Rudowicz, and S. Peters. The fast simulation of 
electromagnetic and hadronic showers. Nucl. Instrum. Meth., A290:469, 1990. 

[63] D. Acosta et al. The performance of the cdf luminosity monitor. Nucl. Instrum. 
Meth., A494:57-62, 2002. 

[64] CDF Collaboration. Further properties of high-mass multijet events at the 
fermilab proton-antiproton collider. 1996. Phys. Rev. D 54 4221. 

[65] CDF Collaboration. Properties of six-jet events with large six-jet mass at the 
fermilab pp̄ collider. 1997. Phys. Rev. D 56 2532. 

[66] D0 Collaboration. Transverse energy distributions within jets in pp̄ collisions at 
√s = 1.8 TeV. 1995. Phys. Lett. B 357 500. 

[67] J. Alwall et al. Comparative study of various algorithms for the merging of parton 
showers and matrix elements in hadronic collisions. 2007, arXiv:0706.2569 
[hep-ph]. 

[68] B. Knuteson. PhD thesis, University of California, Berkeley, 2000. 

[69] D0 Collaboration. Search for new physics in eμX data at D0 using Sleuth: 
A quasi model independent search strategy for new physics. 2000. Phys. Rev. D 
62 092004. 

[70] D0 Collaboration. A quasi-model-independent search for new physics at large 
transverse momentum. 2001. Phys. Rev. D 64 012004. 

[71] D0 Collaboration. A quasi-model-independent search for new high pT physics 
at D0. 2001. Phys. Rev. Lett. 86 3712. 

[72] H1 Collaboration. A general search for new phenomena in ep scattering at hera. 
2004. Phys. Lett. B 602 14. 

[73] CDF Collaboration. Observation of top quark production in pp̄ collisions. 1995. 
Phys. Rev. Lett. 74 2626. 

[74] D0 Collaboration. Observation of the top quark. 1995. Phys. Rev. Lett. 74 
2632. 

[75] CDF Collaboration. Measurement of the top-quark mass in all-hadronic decays 
in pp̄ collisions at CDF II. 2007. Phys. Rev. Lett. 98 142001. 

[76] CDF Collaboration and D0 Collaboration. A combination of CDF and D0 
results on the mass of the top quark. 2007. hep-ex/0703034. 

[77] CDF Collaboration. Search for anomalous production of diphoton events with 
missing transverse energy at cdf and limits on gauge-mediated 
supersymmetry-breaking models. 2005. Phys. Rev. D 71 031104. 

[78] A. Abulencia et al. Search for Z' → e+e- using dielectron mass and angular 
distribution. 2006. Phys. Rev. Lett. 96 211801. 

[79] CDF Collaboration. Search for resonant ttbar production in ppbar collisions at 
√s = 1.96 TeV. 2007, arXiv:0709.0705 [hep-ex]. 

[80] CDF Collaboration. Search for large extra dimensions in the production of jets 
and missing transverse energy in p anti-p collisions at √s = 1.96 TeV. 2006. 
Phys. Rev. Lett. 97 171802. 

[81] CDF Collaboration. Inclusive Search for New Physics with Like-Sign Dileptons 
in pp̄ Collisions at √s = 1.96 TeV. 2006. CDF-8643. 

[82] D. Hare, E. Halkiadakis, T. Spreitzer. Electron ID Efficiency and Scale Factors 
for Winter 2007 Analyses. 2006. CDF-8614. 

[83] Sarah Budd, Matthias Buhler, Catalin Ciobanu, Peter Dong, Richard Hughes, 
Thomas Junk, Kevin Lannon, Jan Lueck, Thomas Muller, Svenja Richter, Jason 
Slaunwhite, Bernd Stelzer, Jeannine Wagner, Wolfgang Wagner, Rainer Wallny, 
Brian Winer. Event detection efficiency for single-top events and MC based 
background estimate for Summer 2006. 2006. CDF-8286. 

[84] A. Djouadi, J. Kalinowski, and M. Spira. HDECAY: A program for Higgs boson 
decays in the standard model and its supersymmetric extension. Comput. Phys. 
Commun., 108:56-74, 1998, hep-ph/9704448. 

[85] Zoltan Nagy. Next-to-leading order calculation of three-jet observables in hadron 
hadron collision. 2003. Phys. Rev. D 68 094002. 

[86] Zoltan Nagy and Zoltan Trocsanyi. Multi-jet cross sections in deep inelastic 
scattering at next-to-leading order. 2001. Phys. Rev. Lett. 87 082001. 

[87] T. Binoth, J. P. Guillet, E. Pilon, and M. Werlen. A full next to leading order 
study of direct photon pair production in hadronic collisions. 2000. Eur. Phys. 
J. C 16 311. 

[88] P. J. Sutton, Alan D. Martin, R. G. Roberts, and W. James Stirling. Parton 
distributions for the pion extracted from drell-yan and prompt photon experiments. 
1992. Phys. Rev. D 45 2349. 

[89] S. Alekhin. Parton distribution functions from the precise nnlo qcd fit. JETP 
Lett., 82:628-631, 2005. 

[90] Daniel Stump et al. Inclusive jet production, parton distributions, and the search 
for new physics. 2003. J. High Energy Phys. 10 046. 

[91] CDF Collaboration. Measurement of the tt̄ production cross section in pp̄ 
collisions at √s = 1.96 TeV using lepton + jets events with secondary vertex 
b-tagging. 2005. Phys. Rev. D 71 052003. 

[92] Andrea Messina. Measurement of the W + jet cross section at cdf. 2007, 
arXiv:0708.1380 [hep-ex]. 

[93] T. Spreitzer, C. Mills, J. Incandela. Electron Identification in Offline Release 
6.1.2. 2006. CDF-7950. 


